Martin Schweinberger
January 1, 2026


This tutorial introduces the foundations of quantitative reasoning and scientific thinking. It asks a deceptively simple question: why can we not simply observe the world carefully and reason from what we see? The answer — that human perception and cognition are systematically biased in ways that evolution has shaped but that our research goals require us to overcome — provides the motivation for the entire scientific enterprise.
The tutorial covers cognitive biases that affect how we perceive patterns, probability, and causation; logical fallacies that undermine valid reasoning; the philosophical foundations of the scientific method including Karl Popper’s theory of falsification; and what it means to apply scientific thinking to linguistics and to everyday claims about the world.
By the end of this tutorial you will be able to:
This tutorial assumes no prior knowledge of statistics or research methods. It is designed as a first step and does not require completion of any earlier tutorial. Readers who want to build directly on this foundation may proceed to:
Martin Schweinberger. 2026. Introduction to Quantitative Reasoning: Why We Need Science. The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia. url: https://ladal.edu.au/tutorials/introquant/introquant.html (Version 2026.05.01).
What you will learn: Why pure logical reasoning cannot answer empirical questions; why careful observation alone is insufficient; and how human cognition is systematically biased in ways that make a disciplined scientific methodology necessary.
Before addressing why science is necessary, it is worth establishing what it is.
Science is a methodological process used to acquire knowledge about the world based on empirical evidence.
The key components are:
For some domains, reasoning alone works well. The formal sciences — logic and mathematics — proceed entirely through deduction:
Premise 1: Socrates is a human being
Premise 2: All humans are mortal
Conclusion: Therefore, Socrates is mortal
If the premises are true and the logic is valid, the conclusion must be true. No observation of Socrates is required.
The problem is that logic cannot tell us which possible world is our world. Consider three equally coherent possibilities:
Possible world 1: I raise my left arm after counting to 3
Possible world 2: I raise my right arm after counting to 3
Possible world 3: I raise neither arm after counting to 3
All three are logically possible. To know which one actually happened requires empirical evidence — observation of what occurred. (For the record: I counted to two and raised neither arm.)
If we need evidence, why not simply observe the world attentively? Because human beings are systematically biased observers. The remainder of this tutorial demonstrates this problem in detail.
What we fear is often not what actually harms us. Two widely cited contrasts illustrate this:
Strangers versus known contacts. Our fear of strangers — sometimes called “stranger danger” — is vivid and pervasive. Yet the evidence consistently shows that most violence against children and adults occurs within families and among known contacts, not from strangers. The fear is misplaced, and the misplacement has real costs in how we direct protective attention.
Sharks versus cows and mosquitoes. Shark attacks are dramatic and memorable, and have been amplified by popular culture. Yet in the United States, cows kill roughly 20 people per year while sharks kill fewer than one on average. Globally, mosquitoes cause around 700,000 deaths annually through disease transmission. The asymmetry between fear and statistical risk is striking.
The explanation is that vivid, emotionally charged narratives override statistical information. Evolutionary pressures favoured quick emotional responses to salient threats over careful actuarial reasoning.
Confirmation bias is the tendency to seek out, interpret, and remember information in ways that confirm what we already believe, while ignoring or discounting contradictory evidence.
This bias is both pervasive and insidious: it affects experts as much as novices, operates even when we are trying to be objective, and reinforces existing beliefs — including incorrect ones — rather than correcting them. We will demonstrate it directly with the Wason Selection Task and the Number Sequence Puzzle in Part 3.
Most people are surprised by how consistently wrong their intuitions are when it comes to probability and statistics. Two classical demonstrations make this vivid.

Monty Hall hosted the American television game show Let’s Make a Deal. The game works as follows: there are three doors; behind one is a prize, behind the other two are goats. You choose a door (say, Door 1). Monty, who knows where the prize is, then opens one of the other doors (say, Door 3), revealing a goat, and asks whether you want to stay with your original choice or switch to the remaining closed door. Does switching improve your chances?
Think about this carefully before reading on. Most people have a strong intuition about the answer.
The intuitive answer is that it does not matter — there are now two doors remaining, so the probability must be 50-50. This is incorrect.
You should always switch. Switching gives you a 2/3 probability of winning; staying gives you only 1/3.
When you initially chose Door 1, you had a 1/3 chance of being right. Doors 2 and 3 together held a 2/3 chance of hiding the prize.
When Monty opens Door 3 (always revealing a goat, because he knows where the prize is), that 2/3 probability does not disappear — it concentrates entirely onto Door 2. Door 1 still has only its original 1/3 probability.
| Door | Before Monty opens Door 3 | After Monty opens Door 3 |
|---|---|---|
| Door 1 (your choice) | 1/3 | 1/3 |
| Door 2 | 1/3 | 2/3 |
| Door 3 | 1/3 | 0 (revealed as goat) |
The key insight is that Monty’s action is not random — he always opens a losing door. That constraint is what transfers probability.
A more transparent version: 20 doors. Imagine 20 doors instead of 3. You pick Door 1 (1/20 chance of winning). Monty then opens 18 doors, all revealing goats, leaving one other door closed. Would you switch? Almost everyone would — it is obvious that the 19/20 probability has concentrated onto that one remaining door. The logic with 3 doors is identical, just less intuitively obvious.
You can verify this empirically using an online Monty Hall simulation. Running 100 trials with each strategy consistently produces roughly 33% wins when staying and 67% wins when switching.
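The simulation is also easy to run yourself in R. The sketch below (function and variable names are my own) plays the game many times with each strategy:

```r
# Simulate one Monty Hall game: returns TRUE if the player wins the prize
play_monty <- function(switch_door) {
  doors <- 1:3
  prize <- sample(doors, 1)    # prize placed behind a random door
  choice <- sample(doors, 1)   # player's initial pick
  # Monty opens a door that is neither the player's pick nor the prize
  openable <- setdiff(doors, c(choice, prize))
  opened <- if (length(openable) == 1) openable else sample(openable, 1)
  # If switching, move to the one door that is neither picked nor opened
  if (switch_door) choice <- setdiff(doors, c(choice, opened))
  choice == prize
}

set.seed(42)
mean(replicate(10000, play_monty(switch_door = FALSE)))  # close to 1/3
mean(replicate(10000, play_monty(switch_door = TRUE)))   # close to 2/3
```

Note the guard before `sample(openable, 1)`: when `openable` contains a single number, R's `sample()` would otherwise treat it as a range.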
How many people need to be in a room for there to be a 50% chance that two of them share a birthday? Think about your answer before reading on.
Most people guess something around 100 or even 183 (half of 365). The correct answer is only 23. With 23 people, the probability that at least two share a birthday is 50.7%.
The calculation is most easily approached by computing the complement — the probability that all 23 people have different birthdays:
Person 1: 365/365 (any birthday is fine)
Person 2: 364/365 (must differ from person 1)
Person 3: 363/365 (must differ from persons 1 and 2)
...
Person 23: 343/365 (must differ from all 22 others)
P(all different) = (365 × 364 × 363 × ... × 343) / 365^23
= 0.4927
P(at least one match) = 1 - 0.4927 = 0.5073
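This complement calculation can be checked directly in R:

```r
# Verify the birthday problem in R
n <- 23
days_in_year <- 365
# Probability that all n people have different birthdays
prob_all_different <- prod((days_in_year - 0:(n - 1)) / days_in_year)
prob_match <- 1 - prob_all_different
prob_match
# [1] 0.5072972
```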
With 70 people, the probability of a shared birthday exceeds 99.9%.
The lesson is that we systematically underestimate how quickly probabilities accumulate — particularly with combinatorial calculations. We are reasonably good at linear arithmetic but very poor at reasoning about exponential growth and compound probabilities. This is one of many reasons why statistical analysis cannot be replaced by intuition.

A ball and a bat together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?
Most people immediately answer “10 cents.” This is wrong. If the ball costs 10 cents, the bat costs $1.10, and the total is $1.20 — not $1.10.
The correct answer is 5 cents: ball = $0.05, bat = $1.05, total = $1.10.
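The algebra can be spelled out in a couple of lines of R (variable names are my own): if the ball costs x, then x + (x + 1.00) = 1.10, so 2x = 0.10 and x = 0.05.

```r
# Solve: ball + bat = 1.10, where bat = ball + 1.00
ball <- (1.10 - 1.00) / 2    # 2 * ball + 1.00 = 1.10  =>  ball = 0.05
bat  <- ball + 1.00
c(ball = ball, bat = bat, total = ball + bat)
# ball = 0.05, bat = 1.05, total = 1.10
```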
The psychologist Daniel Kahneman distinguishes two modes of cognition (Kahneman 2011):
System 1 (fast thinking) operates automatically and effortlessly. It generates intuitive responses based on pattern recognition and association. It is fast and requires no conscious effort — but it regularly produces errors on problems that require careful reasoning.
System 2 (slow thinking) is deliberate, effortful, and analytical. It applies logical rules and checks its own work. It is more reliable but requires cognitive effort that we often avoid expending.
The ball and bat problem shows System 1 in action: it generates “10 cents” almost instantly because the numbers $1.00 and $0.10 are salient and combine to give a plausible total. System 2, if engaged, immediately detects the error — but System 1 answers first and System 2 tends to be lazy about checking plausible-seeming answers.
Science can be understood as a set of institutional and methodological procedures designed to force deliberate, effortful, System 2 reasoning. Peer review, pre-registration, replication, controlled experiments, and statistical testing are all mechanisms for preventing the fast, intuitive, and frequently wrong conclusions of System 1 from being accepted as knowledge. Science is expensive in time and effort — but it produces more reliable knowledge precisely because of that cost.
In a classic experiment, B. F. Skinner (1948) placed pigeons in boxes where food was delivered at random intervals, with no connection to anything the pigeon did. The result was that each pigeon developed idiosyncratic repetitive behaviours — one turned in circles, another pecked at corners of the box — which it had happened to be performing when food arrived by chance.

The pigeons had assumed a causal connection between their behaviour and the food reward, even though the delivery was entirely random. Each accidental co-occurrence reinforced the behaviour, creating what Skinner called “superstitious” conditioning.
Human superstitions operate by the same mechanism. Athletes who perform well while wearing a particular item of clothing begin treating that item as a causal agent. Gamblers develop “systems” based on perceived patterns in random sequences. In all cases, the cognitive machinery evolved to detect genuine patterns in the environment applies itself inappropriately to random co-occurrences.
Why this matters for research: The same tendency that creates superstition in pigeons and humans can create false patterns in data. If you run enough analyses on a dataset, some will produce significant results by chance alone. This is one reason why hypotheses should be specified before data collection (pre-registration), not inferred from the data retrospectively.
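The "run enough analyses and something will look significant" point can be demonstrated with a small simulation (a sketch; the sample sizes and number of tests are arbitrary choices for illustration). We compare two groups of pure random noise 1,000 times: there is no real effect anywhere, yet about 5% of the tests come out "significant" at p < .05.

```r
# 1,000 t-tests on pure noise: no true effect exists in any of them
set.seed(123)
p_values <- replicate(1000, t.test(rnorm(30), rnorm(30))$p.value)
mean(p_values < 0.05)   # close to 0.05: ~5% false positives by chance alone
```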
Pareidolia is the perception of meaningful patterns — especially faces — in random or ambiguous stimuli. Famous examples include the “Face on Mars” photographed by Viking 1 in 1976 (later shown to be an ordinary rock formation under different lighting), apparent religious figures in food burn marks, and the “Man in the Moon” (with different cultures perceiving different figures in the same lunar surface).
The evolutionary explanation (Bruce Hood, University of Bristol) is straightforward. The ability to quickly detect faces — and particularly to distinguish friend from foe, safe from threatening — was highly adaptive. The cost of a false negative (failing to detect a real face when one is present) was potentially severe: missing a predator or failing to recognise an enemy. The cost of a false positive (seeing a face where there is none) was low: a momentary misperception with no lasting consequence. Evolution therefore favoured an over-sensitive face-detection system, and we inherit the result.
A professor offers you $10 to wear a sweater for one minute. Would you accept?
Most people would. Now consider an additional detail: the sweater previously belonged to a convicted serial killer. Does this change your answer?
Many people become reluctant, or feel discomfort even if they would still accept. Rationally, the sweater is just cloth — its history carries no physical trace that could harm the wearer. Yet the feeling of contamination is real and difficult to dismiss by reasoning.
The evolutionary explanation mirrors that for pareidolia. Ancestors who avoided objects associated with disease, death, or dangerous individuals were at a genuine survival advantage — contaminated objects can carry pathogens. The emotional response of disgust and avoidance was adaptive. Today, that same response activates in contexts where it no longer makes adaptive sense but where we inherited the tendency nonetheless.
Anthropocentric bias (sometimes called experiential realism) is the assumption that the world appears to all organisms as it appears to us — that our perceptual experience constitutes, rather than merely filters, reality.
Consider human versus bee vision. Humans perceive light in the wavelength range of approximately 400–700 nanometres. Bees perceive roughly 300–650 nm, which includes ultraviolet light but excludes red (which appears black to them). The practical consequence is that flowers look dramatically different to bees than to us: many flowers have ultraviolet patterns that guide bees to nectar but are completely invisible to human eyes.

The cognitive linguists Vyvyan Evans and Melanie Green put this well:
“However, the parts of this external reality to which we have access are largely constrained by the ecological niche we have adapted to and the nature of our embodiment. In other words, language does not directly reflect the world. Rather, it reflects our unique human construal of the world: our ‘world view’ as it appears to us through the lens of our embodiment.”
— Evans and Green (2006, 46)
The implications for research are significant. Any science that takes human perception as a transparent window onto reality — rather than as one evolved, partial, species-specific perspective on it — will systematically reproduce the biases of that perspective. This is a further argument for why we need systematic, instrument-mediated, and community-checked science rather than just careful personal observation.


What happens: After staring at the red square, the left portion of the dunes appears greenish. After staring at the green square, the right portion appears reddish.
Why: The red-sensitive and green-sensitive photoreceptors in your retina become temporarily fatigued (their photopigments are bleached by prolonged stimulation). When you look at the neutral sand, the fatigued cells fire less strongly, so the complementary colour dominates. What you “see” is not simply what is there — it is the output of a neurophysiological process that is itself subject to fatigue, context, and prior stimulation.
Gestalt psychology (from the German word for “form” or “shape”) studies how we perceive unified wholes from collections of parts. Several classic principles demonstrate that perception is an active, constructive process, not a passive recording of stimulation.

The Kanizsa triangle above contains no actual triangle — there are three Pac-Man shapes and three angle markers. Yet virtually everyone perceives a bright white triangle overlaying the other elements. The brain constructs the missing contours from partial information, using the principle of closure (completing incomplete shapes).
When the same elements are rearranged, the triangle disappears and three Pac-Man shapes appear instead:

Same elements, different arrangement — radically different perception. Other Gestalt principles include proximity (nearby items are perceptually grouped), similarity (similar items are grouped), continuity (smooth lines are preferred over sharp changes), and common fate (items moving together are grouped).
All of these principles demonstrate the same point: perception is not a record of the external world but a construction that the brain generates based on partial information, prior expectations, and evolved heuristics.

Look at the two upside-down faces above. One may seem slightly unusual, but both appear roughly human and recognisable.
Now look at the same images right-side-up:

The distortion — eyes and mouth inverted relative to the face — that was barely noticeable upside-down is now grotesque and immediately obvious.
Why: When a face is inverted, the brain does not deploy its specialised face-processing system; it processes the image as a generic object. Local distortions go unnoticed. When the face is right-side-up, the full face-processing architecture activates, and the mismatch between the expected face template and the actual distorted image is immediately detectable. Context (orientation) determines which perceptual processing system is recruited, and that choice determines what we see.
The ambiguous figure below illustrates how context determines categorical perception:

The middle symbol in the alphabetic sequence A, B, C is typically read as the letter “B.”

The same symbol in the numeric sequence 12, 13, 14 is typically read as the number “13.”
The physical stimulus is identical in both cases. What changes is the context, which activates different prior expectations and determines which categorical interpretation the perceiver reaches. The same stimulus produces different perceptions depending on its context. This has direct implications for linguistics: the same linguistic form can carry different meanings in different contexts, and we cannot study meaning without studying context.
Q1. The Monty Hall problem reveals a systematic failure of probabilistic intuition. The core of the correct solution is that Monty’s action is not random. Which statement best captures why this matters?
Q2. Pareidolia and Skinner’s pigeon experiments both illustrate the same underlying cognitive tendency. What is it?
What you will learn: The most common logical fallacies encountered in academic discourse, media, and everyday argumentation — what they are, why they are fallacious, and how to recognise and counter them.
A logical fallacy is a pattern of argument that appears persuasive but contains a fundamental flaw in reasoning. Logical fallacies are not merely weak arguments — they are systematically invalid in a way that can be precisely identified.
Recognising logical fallacies matters because they are pervasive in public discourse, because everyone is susceptible to them (including trained researchers), and because they prevent accurate conclusions and undermine rational debate. Being able to name and explain a fallacy is not merely an academic exercise: it is a practical tool for evaluating claims.
What it is: Selectively seeking out, reporting, or emphasising evidence that supports a preferred conclusion while ignoring or discounting contradictory evidence.
Example:
Claim: "Vaccines cause autism!"
Evidence cited: 1 study (subsequently retracted for scientific fraud) that found a link
Evidence ignored: 100+ subsequent independent studies that found no link
Why it is a fallacy: The strength of evidence lies in its totality, not in the existence of at least one supporting study. Every scientific question can find at least one study pointing in any direction; what matters is the weight and quality of the full body of evidence.
Scientific solution: Pre-register analysis plans before collecting data; report all results including negative ones; conduct systematic reviews and meta-analyses that pool evidence across studies.
What it is: Attacking the character, credentials, or motives of a person making an argument rather than addressing the argument itself.
Examples:
Why it is a fallacy: A person’s character, political affiliation, or funding source does not determine whether their argument is logically valid or their evidence reliable. These are separate questions. An argument must be evaluated on its own merits.
Correct approach: Identify specific methodological or logical flaws in the argument itself. If funding bias is a concern, examine whether the methods and conclusions are appropriate — not whether the funding source is ideologically convenient.
What it is: Citing a person’s authority or expertise as the sole justification for accepting a claim, without engaging with the evidence or reasoning behind it.
When it is not a fallacy: Citing a researcher’s work in the sense of engaging with their evidence and methods is entirely appropriate. “According to Smith et al. (2020), who found X using method Y…” is legitimate evidence-based reasoning.
When it is a fallacy:
Key distinction: An authority’s evidence and reasoning can be cited as support; an authority’s opinion alone is not evidence.
What it is: Misrepresenting an opponent’s position — usually by exaggerating or oversimplifying it — in order to attack the weaker, distorted version rather than the actual argument.
Example:
Person A: "We should have some regulations on firearms to reduce violence."
Person B: "You want to ban all guns and leave people completely defenceless!"
Person A said nothing about banning all guns. Person B has constructed a distorted version (“straw man”) of the argument because it is easier to defeat than the actual position.
Why it is called “straw man”: A straw man is easy to knock down, unlike a real opponent. Winning against a straw man creates the appearance of having refuted the real argument without having engaged with it.
What it is: Claiming that a proposition is true because it has not been proven false (or vice versa). Treating absence of evidence as evidence of absence — or, more commonly in practice, as evidence of presence.
Examples:
Why it is wrong: Absence of evidence is not, in general, evidence of absence. There are many things that have not yet been investigated. The appropriate response to insufficient evidence is to remain agnostic — to say “we do not yet know” — not to fill the gap with a preferred explanation.
Correct reasoning: Maintain that the burden of proof lies with the person making the positive claim. Absence of disproof does not confirm the claim; it merely leaves it untested.
What it is: Presenting a situation as though only two options exist, when in fact more are available — typically by framing the two extreme positions as the only possibilities.
Examples:
Why it is manipulative: It forces a choice between extremes, eliminates middle ground and compromise, and polarises discussion by making nuanced positions invisible.
What it is: Claiming that one action will inevitably lead, through a chain of steps, to an extreme and undesirable outcome — without providing evidence that the causal chain would actually operate.
Examples:
When it is legitimate: When there is actual evidence that each step in the chain follows predictably from the previous one, a slope argument may be valid. The fallacy lies in asserting the chain without that evidence.
When it is a fallacy: When the argument relies on fear of an extreme outcome rather than on evidence that the intermediate steps are likely.
What it is: An argument in which the conclusion is already contained in, or assumed by, one of the premises. The argument appears to provide evidence for its conclusion but actually just restates the same claim in different words.
Examples:
Why it fails: No new information is added. If you accept the premise, you have already accepted the conclusion. The argument provides no independent reason to believe the conclusion is true.
Valid structure: Independent premises lead through explicit reasoning to a conclusion that was not already assumed in the starting point.
What it is: Introducing irrelevant information to distract from the actual question or issue under discussion.
Example:
Journalist: "Why did the government waste millions on this failed project?"
Politician: "Let me tell you about all the great schools we have built.
Education is so important, do you not agree?"
The politician has not addressed the question of the waste. Instead, they have introduced a different — and more politically comfortable — topic.
Why it works: People naturally follow new conversational directions, and the original question is easy to lose track of, especially in spoken discourse.
What it is: Continuing to invest resources (time, money, effort) in something because of what has already been invested, even when the future expected costs outweigh the future expected benefits.
Examples:
Why it is irrational: Past costs are irretrievable. They cannot be recovered and are therefore irrelevant to the decision about what to do next. The only rational question is: given the current situation, do the expected future benefits outweigh the expected future costs?
Rational approach: Evaluate each decision forward-looking only. Ask: if I were starting from scratch with no prior investment, would I begin this? If no, the sunk cost fallacy may be operating.
Without awareness of logical fallacies, researchers and readers reach wrong conclusions, waste resources, defend indefensible positions, and spread misinformation — even in good faith.
Science provides the institutional antidote: peer review catches ad hominem and cherry-picking; pre-registration counters confirmation bias; the requirement to engage with the strongest version of opposing theories counters straw man arguments; and the norm of reporting negative results counters selective reporting.
Recognising fallacies in one’s own thinking is harder than recognising them in others’ — but it is the more important skill.
You see four cards. Each card has a letter on one side and a number on the other side. The visible faces are:
Card 1: A Card 2: K Card 3: 2 Card 4: 7
The rule: “If there is a vowel on one side of a card, then there is an even number on the other side.”
Think carefully. You need to choose the minimum set of cards that could definitively falsify the rule.
The most common answer is Cards 1 and 3 (A and 2). This is incorrect.
The correct answer is Cards 1 and 4 (A and 7).
What this demonstrates: Most people turn over cards that confirm the rule (vowel, even number) rather than cards that could falsify it (vowel?, odd number). This is confirmation bias operating in a purely logical context. Scientific thinking requires actively seeking evidence that could prove you wrong, not just evidence consistent with your hypothesis.
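The logic of the correct answer can be made explicit in a few lines of R (a sketch; the card representation and helper functions are my own). A card must be turned over only if its visible face could be one half of a violation of "vowel → even": a visible vowel (the hidden side might be odd) or a visible odd number (the hidden side might be a vowel).

```r
# The rule "vowel -> even" is violated only by a card pairing a vowel with an odd number
cards <- c("A", "K", "2", "7")
is_vowel <- function(f) f %in% c("A", "E", "I", "O", "U")
is_odd <- function(f) {
  suppressWarnings(!is.na(as.numeric(f)) && as.numeric(f) %% 2 == 1)
}
# Turn a card only if its visible face could be part of a violation
must_turn <- sapply(cards, function(f) is_vowel(f) || is_odd(f))
cards[must_turn]
# [1] "A" "7"
```

The consonant K and the even number 2 are irrelevant: whatever is on their hidden sides, the rule cannot be violated.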
Here are three numbers that follow a rule I have in mind:
1 2 4
You may propose one additional number, and I will tell you whether it follows my rule. What number would you choose, and what rule do you hypothesise?
What is your hypothesis about the rule? What number would best test it?
Typical responses: Most people guess 8 (following the “doubling” hypothesis) or 16 (following the “squaring” hypothesis). Both follow the rule — but neither tests whether the hypothesis is correct.
A better strategy: Propose a number that your hypothesis predicts would not follow the rule — say, 5 or 7 or 10. If the rule is “each number must be larger than the previous number” (which it is), then 5 follows the rule even though the doubling hypothesis predicts it should not, thereby falsifying the doubling hypothesis.
The actual rule: “Each number must be larger than the previous number.”
What this demonstrates: Confirmation bias again. When people think their hypothesis is “doubling,” they propose numbers that would confirm it rather than numbers that would challenge it. But only by testing the boundaries of your hypothesis — by attempting to falsify it — can you distinguish your hypothesis from the many other hypotheses compatible with your initial evidence.
Q3. A politician responds to a question about rising crime rates by saying: “I am proud of the new schools this government has built, and education is the foundation of a safe society.” Which fallacy does this illustrate?
Q4. A friend argues: “I have watched five seasons of this series and it has been mediocre throughout. I might as well finish it — I have already invested 40 hours.” What is the flaw in this reasoning?
What you will learn: A comprehensive definition of science; the distinction between empirical and formal sciences; the scientific method as a cycle of hypothesis-testing; the Clever Hans case study as a demonstration of why methodology matters; and Popper’s principle of falsification as the criterion that separates scientific from non-scientific claims.
The working definition given in Part 1 can now be made more precise:
Science is an unbiased, fundamentally methodological enterprise that aims at building and organising knowledge about the empirical world in the form of falsifiable explanations and predictions, by means of systematic observation and experimentation.
The key components are:
Empirical sciences examine phenomena of reality through the scientific method. Their goal is to explain and predict what actually exists and occurs. Examples include biology, physics, chemistry, psychology, sociology, and linguistics. Their method involves observing reality, forming and testing hypotheses, and refining theories in response to evidence.
Formal sciences examine abstract systems through axiomatic reasoning. Their goal is logical coherence and internal consistency. Examples include mathematics, formal logic, theoretical computer science, and formal linguistics. Their method involves starting from axioms, applying logical operations, and deriving theorems. Crucially, formal sciences can prove their results — because their claims concern abstract objects defined by their own axioms.
The key difference is epistemological: formal sciences can establish truths by logical proof; empirical sciences cannot prove — they can only test and potentially falsify. As Popper showed, this asymmetry between proof and falsification is fundamental to understanding how science works.

Science does not proceed in a straight line from observation to truth. It is a cycle of hypothesis formation, testing, revision, and renewed testing — a continuous self-correcting process.
The basic steps are:
The abstract steps become concrete with a trivial everyday example:
Observation: My keys are missing.
Question: Where are my keys?
Literature: I have left them on the TV table before.
H₁: My keys are on the TV table.
H₀: My keys are NOT on the TV table.
Design: I will check the TV table.
Data: I checked — no keys there.
Analysis: H₀ cannot be rejected.
Conclusion: My keys must be elsewhere.
New H₁: My keys are in my coat pocket.
[Repeat]
This trivial example captures the logic that applies to the most sophisticated experiments.

Between 1891 and 1904, a horse named Clever Hans became famous across Europe for apparently being able to perform arithmetic, answer questions in German, spell words, and tell the time. His owner, Wilhelm von Osten, would ask questions and Hans would tap his hoof the correct number of times. Multiple scientific commissions investigated and found no evidence of fraud. Von Osten appeared to genuinely believe in his horse’s abilities.
The psychologist Oskar Pfungst (1907) took a more systematic approach. He designed controlled experiments varying two factors: whether the questioner knew the correct answer, and whether Hans could see the questioner.
| Condition | Result |
|---|---|
| Questioner knows the answer and Hans can see the questioner | Hans answers correctly |
| Questioner does not know the answer | Hans cannot answer |
| Hans cannot see the questioner (blinders) | Hans cannot answer |
The pattern was unambiguous: Hans’s performance depended entirely on whether he could see someone who knew the answer.
Pfungst found that questioners unconsciously provided micro-cues that Hans had learned to read. When asking a question requiring a numerical tap count, the questioner would unconsciously tense up; as Hans approached the correct number, the questioner would relax slightly. Hans had learned to start tapping at the tensing cue and stop at the relaxing cue — appearing to know the answer when he was actually reading involuntary muscle movements.
The term Clever Hans effect now refers to any situation in which an experimenter’s unconscious behaviour influences a subject’s responses, and serves as a reminder of why blinding and systematic methodology are not merely bureaucratic requirements but essential safeguards against self-deception.

The Austrian-British philosopher Karl Popper (1902–1994) identified a fundamental problem with the traditional view of science as proceeding from many observations to general laws:
Traditional view:
Observation 1: Swan 1 is white
Observation 2: Swan 2 is white
...
Observation 10,000: Swan 10,000 is white
↓
Law: All swans are white
Popper’s insight: No number of confirming observations can prove a universal generalisation true. No matter how many white swans you observe, the 10,001st swan might be black. And indeed, when Europeans arrived in Australia they encountered black swans — observations that immediately falsified the “all swans are white” generalisation that had seemed secure for centuries.
But notice the asymmetry: a single black swan is sufficient to refute the universal claim. While we cannot verify by accumulating positive evidence, we can — and must — test by seeking negative evidence.
A theory is scientific if and only if it is falsifiable.
A theory is falsifiable when it is possible to describe, in advance, what kind of observation would prove it wrong. Falsifiable theories take an empirical risk: they stake out a position that could be contradicted by evidence.
A theory that is compatible with every possible observation is not scientific — not because it is necessarily false, but because it cannot be tested and therefore cannot be part of the self-correcting process that constitutes science.
Falsifiable (scientific) examples:
- "All swans are white" (a single black swan refutes it)
- "Children learn high-frequency grammatical patterns faster than low-frequency ones" (measured learning rates could contradict it)
Not falsifiable (not scientific) examples:
- "Everything happens for a reason" (compatible with any conceivable observation)
- "Patients' behaviour, whatever form it takes, confirms psychoanalytic theory" (no observation could count against it)
The last example points to Popper’s famous critique of psychoanalysis: Freudian theory, he argued, is structured so that any conceivable behaviour can be interpreted as confirming it. A patient who is close to their mother confirms the Oedipal hypothesis; a patient who is distant from their mother confirms it too (they are “repressing” their feelings). A theory that cannot be falsified by any evidence is not a scientific theory — even if it happens to be true.
Popper drew an analogy between science and biological evolution. In evolution, genetic variation is subjected to natural selection — variants that fit their environment survive; those that do not are eliminated. In science, theoretical variation (new hypotheses and conjectures) is subjected to empirical testing — theories that withstand attempts at falsification survive; those that do not are rejected. Both processes are progressive but not teleological: they eliminate what does not work without guaranteeing that what remains is final truth.
Implications for research practice:
- State in advance what evidence would falsify your hypothesis.
- Design studies that could disconfirm your predictions, not merely confirm them.
- Treat theories that survive testing as provisionally corroborated, never as proven.
Linguistics is the scientific study of language and individual languages. Linguists aim to uncover, describe, explain, and model the systems that underlie human language use.
As an empirical science, linguistics studies language through systematic observation of real language use, tests hypotheses about linguistic structure and function, and produces falsifiable claims about how language works.
Descriptive versus prescriptive linguistics illustrates the scientific/non-scientific distinction:
| Approach | Character | Example |
|---|---|---|
| Descriptive (scientific) | Describes what speakers actually do | “English speakers frequently use ain’t in casual conversation” |
| Prescriptive (non-scientific) | Prescribes what speakers should do | “You should not say ain’t” |
Prescriptive claims are not falsifiable in Popper’s sense — they are normative, not empirical. Descriptive claims can be tested against corpus data and thus belong to the domain of science.
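To make the contrast concrete, here is a minimal sketch of how a descriptive claim about *ain't* could be checked against data. The five-utterance mini-corpus and the `count_aint` helper are invented purely for illustration; a real study would query an actual corpus.

```r
# Illustrative mini-corpus (invented utterances, not real data)
casual <- c("i ain't done it", "she ain't here", "we are not ready",
            "he is not coming", "they ain't listening")
formal <- c("we are not amused", "it is not acceptable",
            "that is not correct", "this is not ideal")

# Count utterances containing the form "ain't"
count_aint <- function(utterances) sum(grepl("ain't", utterances, fixed = TRUE))

# Relative frequency per utterance in each (toy) register
c(casual = count_aint(casual) / length(casual),
  formal = count_aint(formal) / length(formal))
```

Because the claim is stated in terms of observable frequencies, counts like these could in principle contradict it, which is exactly what makes it scientific.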
Example: the scientific circle in linguistics
Observation: Children appear to learn grammar without explicit instruction.
Question: How do children acquire language?
Literature: Chomsky's Universal Grammar hypothesis;
Tomasello's usage-based approach.
H₁: Children extract grammatical patterns through frequency tracking.
Design: Expose children to artificial language with manipulated
input frequencies; record which patterns they learn.
Data: Children's productions; error patterns; learning rates.
Analysis: Compare learning rates for high- versus low-frequency patterns.
Conclusion: Higher frequency predicts faster acquisition —
supports usage-based hypothesis.
Refinement: Test with different age groups, complexity levels.
[Repeat]
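The comparison step in this design can be sketched with simulated data. The accuracy scores below are invented solely to illustrate how the high- versus low-frequency comparison might be analysed; they are not results from any real acquisition study.

```r
set.seed(42)  # reproducibility

# Invented accuracy scores for 30 children in each condition
n_children <- 30
high_freq <- rnorm(n_children, mean = 0.80, sd = 0.10)  # high-frequency patterns
low_freq  <- rnorm(n_children, mean = 0.65, sd = 0.10)  # low-frequency patterns

# H0: no difference in learning accuracy; H1: frequency matters
t.test(high_freq, low_freq, paired = TRUE)
```

The paired t-test operationalises the hypothesis: a significant difference in the predicted direction would support the usage-based account, while its absence would count against it.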
Q5. A researcher proposes the theory: “Students who feel positively about their lecturer will perform better on written assessments.” Is this theory scientific in Popper’s sense?
Q6. A therapist argues: “If a patient denies having repressed childhood trauma, that itself shows how deeply it is repressed. If a patient acknowledges having difficult memories, that confirms the trauma theory.” What is the scientific problem with this argument?
What you will learn: How to apply the scientific method to real-world claims — including health claims, news reports, and unusual beliefs — and how to design a linguistics study from the ground up.
Given what we have covered, we can offer a scientific analysis of why people believe in ghosts — not a dismissal of those beliefs, but an explanation of the cognitive and perceptual mechanisms that generate such experiences in the absence of actual ghosts.
Several factors operate together:
Pareidolia and agency detection — the brain is primed to detect faces and intentional agents. In low light, in unfamiliar environments, or when anxious, ambiguous stimuli are more likely to be interpreted as presences.
Confirmation bias — people who believe in ghosts attend to and remember experiences that are consistent with that belief (unexplained sounds, feelings of being watched) and discount or forget the vast majority of experiences that have mundane explanations.
Sleep paralysis — during transitions in and out of REM sleep, it is possible to experience vivid hallucinations combined with an inability to move. This experience — including the sensation of a threatening presence in the room — is well-documented neurologically and has likely generated ghost and demon narratives across cultures.
Infrasound — sounds below the threshold of human hearing (below roughly 20 Hz) can produce feelings of unease, anxiety, and the sensation of an unseen presence. Old buildings with large resonant chambers sometimes produce infrasound.
Emotional factors — grief, sleep deprivation, and fear heighten the tendency to perceive meaningful patterns in ambiguous stimuli.
None of these explanations requires ghosts to exist. Together, they account for the full range of reported ghost experiences using well-understood mechanisms.
Claim: “Vitamin X cures cancer!”
Applying scientific criteria:
An anecdote about one person who took the vitamin and recovered is not evidence in the relevant sense — because people recover from cancer without the vitamin, and we have no way of knowing what would have happened without it.
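A small simulation makes the point concrete. Assuming, purely for illustration, a 10% spontaneous recovery rate and a vitamin with no effect whatsoever, "success stories" still appear among vitamin-takers by chance alone.

```r
set.seed(1)  # reproducibility

n_patients <- 1000
baseline_recovery <- 0.10                    # assumed spontaneous recovery rate
takes_vitamin <- rbinom(n_patients, 1, 0.5)  # vitamin assigned at random
recovers <- rbinom(n_patients, 1, baseline_recovery)  # independent of the vitamin

# "Success stories": people who took the vitamin and recovered anyway
sum(takes_vitamin == 1 & recovers == 1)
```

Roughly fifty such anecdotes emerge from a treatment with zero causal effect, which is why a control group, not testimonials, is required.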
Headline: “Study shows chocolate improves memory!”
Critical questions for any such claim:
Claim: “This quantum healing bracelet balances your body’s energy.”
Applying scientific analysis:
You want to investigate whether younger or older speakers of English differ in spoken fluency. How would you design this study scientifically?
Think through the following before reading the answer below.
When evaluating evidence or making decisions, watch for: confirmation bias, emotional reasoning that overrides base rates, pattern-seeking in random data, poor probabilistic intuition, and the assumption that your own perception is a transparent window onto reality.
1. Observe → 2. Question → 3. Review literature →
4. Hypothesise (H₁ and H₀) → 5. Design → 6. Collect data →
7. Analyse → 8. Conclude → 9. Refine → [Repeat]
Key principles: falsifiable hypotheses; controlled observation; statistical analysis; peer review; replication.
Questions to ask of any empirical claim: Is it falsifiable? What evidence would count against it? Is the supporting evidence systematic or merely anecdotal? Has the result been replicated?
Martin Schweinberger. 2026. Introduction to Quantitative Reasoning: Why We Need Science. The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia. url: https://ladal.edu.au/tutorials/introquant/introquant.html (Version 2026.05.01), doi: 10.5281/zenodo.19332884.
@manual{martinschweinberger2026introduction,
author = {Martin Schweinberger},
title = {Introduction to Quantitative Reasoning: Why We Need Science},
year = {2026},
note = {https://ladal.edu.au/tutorials/introquant/introquant.html},
organization = {The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia},
edition = {2026.05.01},
doi = {10.5281/zenodo.19332884}
}
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: Australia/Brisbane
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] checkdown_0.0.13 tufte_0.13 cowplot_1.2.0 magick_2.8.5
[5] lubridate_1.9.4 forcats_1.0.0 stringr_1.6.0 dplyr_1.2.0
[9] purrr_1.2.1 readr_2.1.5 tidyr_1.3.2 tibble_3.3.1
[13] ggplot2_4.0.2 tidyverse_2.0.0 flextable_0.9.11 knitr_1.51
loaded via a namespace (and not attached):
[1] gtable_0.3.6 xfun_0.56 htmlwidgets_1.6.4
[4] tzdb_0.5.0 vctrs_0.7.2 tools_4.4.2
[7] generics_0.1.4 pkgconfig_2.0.3 data.table_1.17.0
[10] RColorBrewer_1.1-3 S7_0.2.1 uuid_1.2-1
[13] lifecycle_1.0.5 compiler_4.4.2 farver_2.1.2
[16] textshaping_1.0.0 codetools_0.2-20 litedown_0.9
[19] fontquiver_0.2.1 fontLiberation_0.1.0 htmltools_0.5.9
[22] yaml_2.3.10 pillar_1.11.1 openssl_2.3.2
[25] fontBitstreamVera_0.1.1 commonmark_2.0.0 tidyselect_1.2.1
[28] zip_2.3.2 digest_0.6.39 stringi_1.8.7
[31] labeling_0.4.3 fastmap_1.2.0 grid_4.4.2
[34] cli_3.6.5 magrittr_2.0.4 patchwork_1.3.0
[37] withr_3.0.2 gdtools_0.5.0 scales_1.4.0
[40] timechange_0.3.0 rmarkdown_2.30 officer_0.7.3
[43] askpass_1.2.1 ragg_1.5.1 hms_1.1.4
[46] evaluate_1.0.5 markdown_2.0 rlang_1.1.7
[49] Rcpp_1.1.1 glue_1.8.0 BiocManager_1.30.27
[52] xml2_1.3.6 renv_1.1.7 rstudioapi_0.17.1
[55] jsonlite_2.0.0 R6_2.6.1 systemfonts_1.3.1
This tutorial was revised and restyled with the assistance of Claude (claude.ai), a large language model created by Anthropic. All substantive content — examples, explanations, case studies, and reasoning — was retained from the original and reviewed and approved by Martin Schweinberger, who takes full responsibility for the tutorial’s accuracy.
---
title: "Introduction to Quantitative Reasoning: Why We Need Science"
author: "Martin Schweinberger"
date: "2026"
params:
  title: "Introduction to Quantitative Reasoning: Why We Need Science"
  author: "Martin Schweinberger"
  year: "2026"
  version: "2026.05.01"
  url: "https://ladal.edu.au/tutorials/introquant/introquant.html"
  institution: "The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia"
  description: "This tutorial provides a conceptual introduction to quantitative reasoning and the scientific method, covering the logical foundations of empirical research, the history of quantitative thinking, and the philosophical underpinnings of data analysis. It is designed for researchers in linguistics and the humanities who are new to quantitative methods and want to understand the 'why' behind statistical approaches."
  doi: "10.5281/zenodo.19332884"
format:
  html:
    toc: true
    toc-depth: 4
    code-fold: show
    code-tools: true
    theme: cosmo
---
```{r setup, echo=FALSE, message=FALSE, warning=FALSE}
library(checkdown)
library(ggplot2)
library(cowplot)
options(stringsAsFactors = FALSE)
```
{ width=100% }
# Introduction {#intro}
{ width=15% style="float:right; padding:10px" }
This tutorial introduces the foundations of quantitative reasoning and scientific thinking. It asks a deceptively simple question: why can we not simply observe the world carefully and reason from what we see? The answer — that human perception and cognition are systematically biased in ways that evolution has shaped but that our research goals require us to overcome — provides the motivation for the entire scientific enterprise.
The tutorial covers cognitive biases that affect how we perceive patterns, probability, and causation; logical fallacies that undermine valid reasoning; the philosophical foundations of the scientific method including Karl Popper's theory of falsification; and what it means to apply scientific thinking to linguistics and to everyday claims about the world.
::: {.callout-note}
## Learning Objectives
By the end of this tutorial you will be able to:
1. Explain why empirical evidence is necessary, and why pure logical reasoning is insufficient for knowledge about the world
2. Identify and describe the major cognitive biases that affect human reasoning — including confirmation bias, poor probabilistic intuition, pattern-seeking, pareidolia, and anthropocentric perception
3. Recognise and name at least ten common logical fallacies and explain why each undermines valid argumentation
4. Describe Popper's principle of falsification and explain what distinguishes scientific from non-scientific claims
5. Apply the scientific circle to a concrete research question
6. Evaluate everyday claims, health claims, and news stories using scientific criteria
7. Explain why linguistics is an empirical science and distinguish descriptive from prescriptive approaches
:::
::: {.callout-note}
## Prerequisite Tutorials
This tutorial assumes no prior knowledge of statistics or research methods. It is designed as a first step and does not require completion of any earlier tutorial. Readers who want to build directly on this foundation may proceed to:
- [Basic Concepts in Quantitative Research](/tutorials/basicquant/basicquant.html)
- [Descriptive Statistics](/tutorials/dstats/dstats.html)
:::
::: {.callout-note}
## Citation
```{r citation-callout-top, echo=FALSE, results='asis'}
cat(
params$author, ". ",
params$year, ". *",
params$title, "*. ",
params$institution, ". ",
"url: ", params$url, " ",
"(Version ", params$version, ").",
sep = ""
)
```
:::
---
# Part 1: Why We Need Science {#part1}
::: {.callout-note}
## Section Overview
**What you will learn:** Why pure logical reasoning cannot answer empirical questions; why careful observation alone is insufficient; and how human cognition is systematically biased in ways that make a disciplined scientific methodology necessary.
:::
## The problem with intuition {-}
### What science is {-}
Before addressing why science is necessary, it is worth establishing what it is.
::: {.callout-note}
## A working definition of science
**Science** is a methodological process used to acquire knowledge about the world based on empirical evidence.
The key components are:
- **Methodological**: Systematic and principled, not haphazard
- **Process**: Ongoing and self-correcting, not a fixed body of knowledge
- **Empirical**: Grounded in observation of reality, not pure speculation
- **About the world**: Concerned with how things actually are, not just how they could logically be
:::
### Why not just think about it? {-}
For some domains, reasoning alone works well. The **formal sciences** — logic and mathematics — proceed entirely through deduction:
```
Premise 1: Socrates is a human being
Premise 2: All humans are mortal
Conclusion: Therefore, Socrates is mortal
```
If the premises are true and the logic is valid, the conclusion must be true. No observation of Socrates is required.
**The problem is that logic cannot tell us which possible world is *our* world.** Consider three equally coherent possibilities:
```
Possible world 1: I raise my left arm after counting to 3
Possible world 2: I raise my right arm after counting to 3
Possible world 3: I raise neither arm after counting to 3
```
All three are logically possible. To know which one actually happened requires **empirical evidence** — observation of what occurred. (For the record: I counted to two and raised neither arm.)
### Why not just observe carefully? {-}
If we need evidence, why not simply observe the world attentively? Because human beings are systematically biased observers. The remainder of this tutorial demonstrates this problem in detail.
---
## Cognitive biases: how we get it wrong {-}
### Bias 1: Emotional reasoning over facts {-}
What we fear is often not what actually harms us. Two widely cited contrasts illustrate this:
**Strangers versus known contacts.** Our fear of strangers — sometimes called "stranger danger" — is vivid and pervasive. Yet the evidence consistently shows that most violence against children and adults occurs within families and among known contacts, not from strangers. The fear is misplaced, and the misplacement has real costs in how we direct protective attention.
**Sharks versus cows and mosquitoes.** Shark attacks are dramatic and memorable, and have been amplified by popular culture. Yet in the United States, cows kill roughly 20 people per year while sharks kill fewer than one on average. Globally, mosquitoes cause around 700,000 deaths annually through disease transmission. The asymmetry between fear and statistical risk is striking.
The explanation is that vivid, emotionally charged narratives override statistical information. Evolutionary pressures favoured quick emotional responses to salient threats over careful actuarial reasoning.
### Bias 2: Confirmation bias {-}
**Confirmation bias** is the tendency to seek out, interpret, and remember information in ways that confirm what we already believe, while ignoring or discounting contradictory evidence.
This bias is both pervasive and insidious: it affects experts as much as novices, operates even when we are trying to be objective, and reinforces existing beliefs — including incorrect ones — rather than correcting them. We will demonstrate it directly with the Wason Selection Task and the Number Sequence Puzzle in Part 3.
### Bias 3: Poor probabilistic intuition {-}
Most people are surprised by how consistently wrong their intuitions are when it comes to probability and statistics. Two classical demonstrations make this vivid.
---
## The Monty Hall Problem {-}
{ width=35% style="float:right; padding:15px" }
Monty Hall hosted the American television game show *Let's Make a Deal*. The game works as follows:
1. Three doors are presented. Behind two of them are goats; behind one is a prize.
2. The contestant chooses a door (say, Door 1).
3. The host, who knows where the prize is, opens a different door to reveal a goat (say, Door 3).
4. The host asks: "Do you want to switch to Door 2?"
::: {.callout-tip}
## Question: should you switch?
Think about this carefully before reading on. Most people have a strong intuition about the answer.
:::
**The intuitive answer** is that it does not matter — there are now two doors remaining, so the probability must be 50-50. **This is incorrect.**
**You should always switch.** Switching gives you a 2/3 probability of winning; staying gives you only 1/3.
::: {.callout-note}
## Why switching is correct
When you initially chose Door 1, you had a 1/3 chance of being right. Doors 2 and 3 *together* held a 2/3 chance of hiding the prize.
When Monty opens Door 3 (always revealing a goat, because he knows where the prize is), that 2/3 probability does not disappear — it concentrates entirely onto Door 2. Door 1 still has only its original 1/3 probability.
| Door | Before Monty opens Door 3 | After Monty opens Door 3 |
|---|---|---|
| Door 1 (your choice) | 1/3 | 1/3 |
| Door 2 | 1/3 | **2/3** |
| Door 3 | 1/3 | 0 (revealed as goat) |
The key insight is that Monty's action is *not random* — he always opens a losing door. That constraint is what transfers probability.
:::
**A more transparent version: 20 doors.** Imagine 20 doors instead of 3. You pick Door 1 (1/20 chance of winning). Monty then opens 18 doors, all revealing goats, leaving one other door closed. Would you switch? Almost everyone would — it is obvious that the 19/20 probability has concentrated onto that one remaining door. The logic with 3 doors is identical, just less intuitively obvious.
You can verify this empirically using an [online Monty Hall simulation](https://www.mathwarehouse.com/monty-hall-simulation-online/). Running 100 trials with each strategy consistently produces roughly 33% wins when staying and 67% wins when switching.
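If you prefer to verify it yourself, a minimal Monte Carlo sketch (independent of the linked simulator) plays many games under each strategy:

```{r monty-sim}
set.seed(123)

# Play one game; return TRUE if the contestant wins
play <- function(switch_door) {
  prize  <- sample(1:3, 1)
  choice <- sample(1:3, 1)
  # Monty opens a door that is neither the contestant's choice nor the prize
  openable <- setdiff(1:3, c(choice, prize))
  opened <- if (length(openable) == 1) openable else sample(openable, 1)
  if (switch_door) choice <- setdiff(1:3, c(choice, opened))
  choice == prize
}

n_games <- 10000
results <- c(stay   = mean(replicate(n_games, play(FALSE))),
             switch = mean(replicate(n_games, play(TRUE))))
results
```

With 10,000 games per strategy the win rates settle close to 1/3 for staying and 2/3 for switching. (The `if` guard on `openable` avoids R's surprising behaviour where `sample(x, 1)` on a single number samples from `1:x`.)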
---
## The Birthday Problem {-}
::: {.callout-tip}
## Question: how many people?
How many people need to be in a room for there to be a 50% chance that two of them share a birthday? Think about your answer before reading on.
:::
Most people guess something around 100 or even 183 (half of 365). **The correct answer is only 23.** With 23 people, the probability that at least two share a birthday is 50.7%.
The calculation is most easily approached by computing the complement — the probability that **all** 23 people have *different* birthdays:
```
Person 1: 365/365 (any birthday is fine)
Person 2: 364/365 (must differ from person 1)
Person 3: 363/365 (must differ from persons 1 and 2)
...
Person 23: 343/365 (must differ from all 22 others)
P(all different) = (365 × 364 × 363 × ... × 343) / 365^23
= 0.4927
P(at least one match) = 1 - 0.4927 = 0.5073
```
```{r birthday, eval=FALSE}
# Verify in R
n <- 23
days_in_year <- 365
prob_all_different <- prod((days_in_year - 0:(n - 1)) / days_in_year)
prob_match <- 1 - prob_all_different
prob_match
# [1] 0.5072972
```
With 73 people, the probability of a shared birthday exceeds 99.9%.
```{r birthday73, eval=FALSE}
n <- 73
prob_all_different <- prod((days_in_year - 0:(n - 1)) / days_in_year)
prob_match <- 1 - prob_all_different
prob_match
# > 0.999
```
The lesson is that we systematically underestimate how quickly probabilities accumulate — particularly with combinatorial calculations. We are reasonably good at linear arithmetic but very poor at reasoning about exponential growth and compound probabilities. This is one of many reasons why statistical analysis cannot be replaced by intuition.
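The same complement calculation can be swept across group sizes to watch the probability accumulate, and to confirm that 23 is the smallest group with a better-than-even chance:

```{r birthday-sweep}
# Probability of at least one shared birthday in a group of size n
birthday_prob <- function(n, days = 365) {
  1 - prod((days - 0:(n - 1)) / days)
}

probs <- sapply(1:60, birthday_prob)

# Smallest group size with a better-than-even chance
which(probs > 0.5)[1]
# [1] 23
```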
---
## Fast and slow thinking {-}
{ width=30% style="float:right; padding:15px" }
### The ball and bat problem {-}
::: {.callout-tip}
## Question
A ball and a bat together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?
:::
Most people immediately answer "10 cents." This is wrong. If the ball costs 10 cents, the bat costs $1.10, and the total is $1.20 — not $1.10.
The correct answer is **5 cents**: ball = $0.05, bat = $1.05, total = $1.10.
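The System 2 check amounts to solving two linear constraints, ball + bat = 1.10 and bat - ball = 1.00, which can be done mechanically:

```{r ball-bat}
# ball + bat = 1.10 and -ball + bat = 1.00, as a linear system
coeffs <- matrix(c( 1, 1,    # ball + bat
                   -1, 1),   # bat - ball
                 nrow = 2, byrow = TRUE)
prices <- solve(coeffs, c(1.10, 1.00))
prices  # ball, then bat
```

Writing the constraints out, rather than letting the salient numbers suggest an answer, is exactly the deliberate step that System 1 skips.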
### System 1 and System 2 {-}
The psychologist Daniel Kahneman distinguishes two modes of cognition [@kahneman2011fast]:
**System 1 (fast thinking)** operates automatically and effortlessly. It generates intuitive responses based on pattern recognition and association. It is fast and requires no conscious effort — but it regularly produces errors on problems that require careful reasoning.
**System 2 (slow thinking)** is deliberate, effortful, and analytical. It applies logical rules and checks its own work. It is more reliable but requires cognitive effort that we often avoid expending.
The ball and bat problem shows System 1 in action: it generates "10 cents" almost instantly because the numbers $1.00 and $0.10 are salient and combine to give a plausible total. System 2, if engaged, immediately detects the error — but System 1 answers first and System 2 tends to be lazy about checking plausible-seeming answers.
::: {.callout-important}
## Key insight: science as institutionalised System 2 thinking
Science can be understood as a set of institutional and methodological procedures designed to force deliberate, effortful, System 2 reasoning. Peer review, pre-registration, replication, controlled experiments, and statistical testing are all mechanisms for preventing the fast, intuitive, and frequently wrong conclusions of System 1 from being accepted as knowledge. Science is expensive in time and effort — but it produces more reliable knowledge precisely because of that cost.
:::
---
## Seeing patterns in randomness {-}
### Skinner's superstitious pigeons {-}
In a classic experiment, B. F. Skinner (1948) placed pigeons in boxes where food was delivered at random intervals, with no connection to anything the pigeon did. The result was that each pigeon developed idiosyncratic repetitive behaviours — one turned in circles, another pecked at corners of the box — which it had happened to be performing when food arrived by chance.
{ width=60% style="float:center; padding:10px" }
The pigeons had assumed a causal connection between their behaviour and the food reward, even though the delivery was entirely random. Each accidental co-occurrence reinforced the behaviour, creating what Skinner called "superstitious" conditioning.
Human superstitions operate by the same mechanism. Athletes who perform well while wearing a particular item of clothing begin treating that item as a causal agent. Gamblers develop "systems" based on perceived patterns in random sequences. In all cases, the cognitive machinery evolved to detect genuine patterns in the environment applies itself inappropriately to random co-occurrences.
**Why this matters for research:** The same tendency that creates superstition in pigeons and humans can create false patterns in data. If you run enough analyses on a dataset, some will produce significant results by chance alone. This is one reason why hypotheses should be specified before data collection (pre-registration), not inferred from the data retrospectively.
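This can be demonstrated directly. The sketch below runs 100 correlation tests on pure noise; typically a handful come out "significant" at the conventional 0.05 threshold even though no true relationship exists anywhere in the data.

```{r noise-tests}
set.seed(7)

# 100 significance tests on pure noise: no true relationship anywhere
p_values <- replicate(100, {
  x <- rnorm(30)
  y <- rnorm(30)
  cor.test(x, y)$p.value
})

# How many are "significant" at the 0.05 level purely by chance?
sum(p_values < 0.05)
```

About 5 in 100 is exactly what the 0.05 threshold means, which is why unplanned, exploratory "significant" findings must be treated as hypotheses to test, not as results.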
### Pareidolia: seeing faces everywhere {-}
**Pareidolia** is the perception of meaningful patterns — especially faces — in random or ambiguous stimuli. Famous examples include the "Face on Mars" photographed by Viking 1 in 1976 (later shown to be an ordinary rock formation under different lighting), apparent religious figures in food burn marks, and the "Man in the Moon" (with different cultures perceiving different figures in the same lunar surface).
The evolutionary explanation (Bruce Hood, University of Bristol) is straightforward. The ability to quickly detect faces — and particularly to distinguish friend from foe, safe from threatening — was highly adaptive. The cost of a **false negative** (failing to detect a real face when one is present) was potentially severe: missing a predator or failing to recognise an enemy. The cost of a **false positive** (seeing a face where there is none) was low: a momentary misperception with no lasting consequence. Evolution therefore favoured an over-sensitive face-detection system, and we inherit the result.
### The sweater experiment {-}
::: {.callout-tip}
## Thought experiment
A professor offers you $10 to wear a sweater for one minute. Would you accept?
Most people would. Now consider an additional detail: the sweater previously belonged to a convicted serial killer. Does this change your answer?
:::
Many people become reluctant, or feel discomfort even if they would still accept. Rationally, the sweater is just cloth — its history carries no physical trace that could harm the wearer. Yet the feeling of contamination is real and difficult to dismiss by reasoning.
The evolutionary explanation mirrors that for pareidolia. Ancestors who avoided objects associated with disease, death, or dangerous individuals were at a genuine survival advantage — contaminated objects *can* carry pathogens. The emotional response of disgust and avoidance was adaptive. Today, that same response activates in contexts where it no longer makes adaptive sense but where we inherited the tendency nonetheless.
---
## The anthropocentric bias {-}
**Anthropocentric bias** (sometimes called experiential realism) is the assumption that the world appears to all organisms as it appears to us — that our perceptual experience constitutes, rather than merely filters, reality.
Consider human versus bee vision. Humans perceive light in the wavelength range of approximately 400–700 nanometres. Bees perceive roughly 300–650 nm, which includes ultraviolet light but excludes red (which appears black to them). The practical consequence is that flowers look dramatically different to bees than to us: many flowers have ultraviolet patterns that guide bees to nectar but are completely invisible to human eyes.
```{r bee-vision, eval=TRUE, echo=FALSE, fig.width=10, fig.height=5, warning=FALSE, message=FALSE}
p1 <- ggdraw() + draw_image("images/flower1.png", scale = 0.8)
p2 <- ggdraw() + draw_image("images/sky.png", scale = 1.3)
plot_grid(p1, p2,
labels = c("Human view of meadow",
"Bee view (schematic): blossoms\nstand out like stars in night sky"))
```
The cognitive linguists Evans and Green put this well:
> "However, the parts of this external reality to which we have access are largely constrained by the ecological niche we have adapted to and the nature of our embodiment. In other words, language does not directly reflect the world. Rather, it reflects our unique human construal of the world: our 'world view' as it appears to us through the lens of our embodiment."
>
> — @evans2006cognitive [p. 46]
The implications for research are significant. Any science that takes human perception as a transparent window onto reality — rather than as one evolved, partial, species-specific perspective on it — will systematically reproduce the biases of that perspective. This is a further argument for why we need systematic, instrument-mediated, and community-checked science rather than just careful personal observation.
### The afterimage demonstration {-}
::: {.callout-tip}
## Try this
1. Stare at the dot between the red and green squares for 30 seconds without looking away.
2. Immediately shift your gaze to the dot between the sand dunes.
```{r afterimage, eval=TRUE, echo=FALSE, fig.width=6, fig.height=4, warning=FALSE, message=FALSE}
ggdraw() + draw_image("images/redgreen.png", scale = 0.9)
```
```{r desert, eval=TRUE, echo=FALSE, fig.width=6, fig.height=4, warning=FALSE, message=FALSE}
ggdraw() + draw_image("images/desert.png", scale = 0.9)
```
:::
**What happens:** After staring at the red square, the left portion of the dunes appears greenish. After staring at the green square, the right portion appears reddish.
**Why:** The red-sensitive and green-sensitive photoreceptors in your retina become temporarily fatigued (their photopigment is bleached by sustained stimulation). When you look at the neutral sand, the fatigued cells fire less strongly, so the complementary colour dominates. What you "see" is not simply what is there — it is the output of a neurophysiological process that is itself subject to fatigue, context, and prior stimulation.
### Gestalt perception {-}
Gestalt psychology (from the German word for "form" or "shape") studies how we perceive unified wholes from collections of parts. Several classic principles demonstrate that perception is an active, constructive process, not a passive recording of stimulation.
{ width=25% style="float:center; padding:10px" }
The Kanizsa triangle above contains no actual triangle — there are three Pac-Man shapes and three angle markers. Yet virtually everyone perceives a bright white triangle overlaying the other elements. The brain constructs the missing contours from partial information, using the principle of **closure** (completing incomplete shapes).
When the same elements are rearranged, the triangle disappears and three Pac-Man shapes appear instead:
{ width=25% style="float:center; padding:10px" }
Same elements, different arrangement — radically different perception. Other Gestalt principles include **proximity** (nearby items are perceptually grouped), **similarity** (similar items are grouped), **continuity** (smooth lines are preferred over sharp changes), and **common fate** (items moving together are grouped).
All of these principles demonstrate the same point: perception is not a record of the external world but a construction that the brain generates based on partial information, prior expectations, and evolved heuristics.
---
## Context effects on perception {-}
### The Thatcher illusion {-}
```{r thatcher-up, eval=TRUE, echo=FALSE, fig.width=5, fig.height=3.5, warning=FALSE, message=FALSE}
ggdraw() + draw_image("images/thatcher4.png", scale = 0.9)
```
Look at the two upside-down faces above. One may seem slightly unusual, but both appear roughly human and recognisable.
Now look at the same images right-side-up:
```{r thatcher-down, eval=TRUE, echo=FALSE, fig.width=5, fig.height=3, warning=FALSE, message=FALSE}
ggdraw() + draw_image("images/thatcher3.png", scale = 0.9)
```
The distortion — eyes and mouth inverted relative to the face — that was barely noticeable upside-down is now grotesque and immediately obvious.
**Why:** When a face is inverted, the brain does not deploy its specialised face-processing system; it processes the image as a generic object. Local distortions go unnoticed. When the face is right-side-up, the full face-processing architecture activates, and the mismatch between the expected face template and the actual distorted image is immediately detectable. Context (orientation) determines which perceptual processing system is recruited, and that choice determines what we see.
### The B/13 illusion {-}
The ambiguous figure below illustrates how context determines categorical perception:
{ width=35% style="float:center; padding:10px" }
The middle symbol in the alphabetic sequence A, B, C is typically read as the letter "B."
{ width=35% style="float:center; padding:10px" }
The same symbol in the numeric sequence 12, 13, 14 is typically read as the number "13."
The physical stimulus is identical in both cases. What changes is the context, which activates different prior expectations and determines which categorical interpretation the perceiver reaches. **The same stimulus produces different perceptions depending on its context.** This has direct implications for linguistics: the same linguistic form can carry different meanings in different contexts, and we cannot study meaning without studying context.
---
::: {.callout-tip}
## Exercises: Cognitive Biases and Perception
:::
**Q1. The Monty Hall problem reveals a systematic failure of probabilistic intuition. The core of the correct solution is that Monty's action is not random. Which statement best captures why this matters?**
```{r}
#| echo: false
#| label: "BIAS_Q1"
check_question(
"Monty always opens a losing door (he has knowledge of where the prize is), so his action transfers probability from the opened door to the remaining unchosen door — it is not a random 50-50 split.",
options = c(
"Monty always opens a losing door (he has knowledge of where the prize is), so his action transfers probability from the opened door to the remaining unchosen door — it is not a random 50-50 split.",
"Switching is always better because you are simply choosing a new door with no prior information.",
"The probability changes because Monty has removed one of the three options, making the remaining two equally likely.",
"It does not matter — with only two doors remaining, the probability is always 50-50 regardless of Monty's knowledge."
),
type = "radio",
q_id = "BIAS_Q1",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! The key is that Monty's action is constrained: he never opens the door you chose, and he never opens the door hiding the prize. This means his action carries information. The 2/3 probability that was spread across the two doors you did not choose collapses entirely onto the one remaining door — it does not split evenly. If Monty opened doors at random (and sometimes revealed the prize), the problem would be different.",
wrong = "Focus on what makes Monty's action non-random. He knows where the prize is and acts on that knowledge. Does that constraint affect how probability should be distributed after he acts?"
)
```
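If the 2/3 result still feels counterintuitive, a short simulation can verify it. The sketch below assumes nothing beyond base R; the number of simulations is arbitrary.

```r
# Monte Carlo check of the Monty Hall result
set.seed(42)
n_sims <- 100000
prize  <- sample(1:3, n_sims, replace = TRUE)  # door hiding the prize
choice <- sample(1:3, n_sims, replace = TRUE)  # contestant's first pick
# Monty always opens a losing, unchosen door, so switching wins
# exactly when the first pick was wrong:
c(stay = mean(choice == prize), switch = mean(choice != prize))
# stay ≈ 1/3, switch ≈ 2/3
```

Note that the simulation never needs to model which door Monty opens: because his action is fully constrained, switching is equivalent to betting that the first pick was wrong.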
**Q2. Pareidolia and Skinner's pigeon experiments both illustrate the same underlying cognitive tendency. What is it?**
```{r}
#| echo: false
#| label: "BIAS_Q2"
check_question(
"The tendency to perceive meaningful patterns — faces, causal connections, intentional agency — even in random or coincidental stimuli.",
options = c(
"The tendency to perceive meaningful patterns — faces, causal connections, intentional agency — even in random or coincidental stimuli.",
"The tendency to form emotional attachments to objects that have been associated with rewards.",
"The tendency to misremember past events in ways that confirm current beliefs.",
"The tendency to overestimate the frequency of events that are emotionally significant."
),
type = "radio",
q_id = "BIAS_Q2",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! Both phenomena stem from the same evolved tendency: the brain is a pattern-detection machine that is tuned to be oversensitive rather than undersensitive, because the cost of missing a real pattern (predator, social threat, causal regularity) is higher than the cost of falsely detecting one. Pareidolia applies this to perceptual stimuli (faces in clouds); the pigeon experiment shows it in behaviour (assuming a causal link between an action and a reward that were merely temporally contiguous).",
wrong = "Think about what both phenomena have in common at a mechanistic level. In both cases, the organism perceives something that is not really there — a structure or connection in what is actually random. What general tendency explains this?"
)
```
---
# Part 2: Logical Fallacies {#part2}
::: {.callout-note}
## Section Overview
**What you will learn:** The most common logical fallacies encountered in academic discourse, media, and everyday argumentation — what they are, why they are fallacious, and how to recognise and counter them.
:::
## What are logical fallacies? {-}
::: {.callout-note}
## Definition
A **logical fallacy** is a pattern of argument that appears persuasive but contains a fundamental flaw in reasoning. Logical fallacies are not merely weak arguments — they are systematically invalid in a way that can be precisely identified.
Recognising logical fallacies matters because they are pervasive in public discourse, because everyone is susceptible to them (including trained researchers), and because they prevent accurate conclusions and undermine rational debate. Being able to name and explain a fallacy is not merely an academic exercise: it is a practical tool for evaluating claims.
:::
---
## The ten most important fallacies {-}
### 1. Confirmation bias and cherry-picking {-}
**What it is:** Selectively seeking out, reporting, or emphasising evidence that supports a preferred conclusion while ignoring or discounting contradictory evidence.
**Example:**
```
Claim: "Vaccines cause autism!"
Evidence cited: 1 study (subsequently retracted for scientific fraud) that found a link
Evidence ignored: 100+ subsequent independent studies that found no link
```
**Why it is a fallacy:** The strength of evidence lies in its totality, not in the existence of at least one supporting study. Every scientific question can find at least one study pointing in any direction; what matters is the weight and quality of the full body of evidence.
**Scientific solution:** Pre-register analysis plans before collecting data; report all results including negative ones; conduct systematic reviews and meta-analyses that pool evidence across studies.
---
### 2. Ad hominem (attack the person) {-}
**What it is:** Attacking the character, credentials, or motives of a person making an argument rather than addressing the argument itself.
**Examples:**
- "You can't trust their climate research — they are a leftist."
- "His statistics are wrong because he is funded by industry."
- "She is just saying that because she is young and naive."
**Why it is a fallacy:** A person's character, political affiliation, or funding source does not determine whether their argument is logically valid or their evidence reliable. These are separate questions. An argument must be evaluated on its own merits.
**Correct approach:** Identify specific methodological or logical flaws in the argument itself. If funding bias is a concern, examine whether the methods and conclusions are appropriate — not whether the funding source is ideologically convenient.
---
### 3. Appeal to authority {-}
**What it is:** Citing a person's authority or expertise as the *sole* justification for accepting a claim, without engaging with the evidence or reasoning behind it.
**When it is not a fallacy:** Citing a researcher's work in the sense of engaging with their evidence and methods is entirely appropriate. "According to Smith et al. (2020), who found X using method Y..." is legitimate evidence-based reasoning.
**When it is a fallacy:**
- "Einstein said it, so it must be true!" — This is the person's opinion, not the evidence.
- "Dr. X claims treatment Y works, and she is an expert!" — Expertise confers credibility but not infallibility.
**Key distinction:** An authority's *evidence and reasoning* can be cited as support; an authority's *opinion alone* is not evidence.
---
### 4. Straw man {-}
**What it is:** Misrepresenting an opponent's position — usually by exaggerating or oversimplifying it — in order to attack the weaker, distorted version rather than the actual argument.
**Example:**
```
Person A: "We should have some regulations on firearms to reduce violence."
Person B: "You want to ban all guns and leave people completely defenceless!"
```
Person A said nothing about banning all guns. Person B has constructed a distorted version ("straw man") of the argument because it is easier to defeat than the actual position.
**Why it is called "straw man":** A straw man is easy to knock down, unlike a real opponent. Winning against a straw man creates the appearance of having refuted the real argument without having engaged with it.
---
### 5. Argument from ignorance {-}
**What it is:** Claiming that a proposition is true because it has not been proven false (or vice versa). Treating absence of evidence as evidence of absence — or, more commonly in practice, as evidence of presence.
**Examples:**
- "No one has proven aliens do not exist, so they must be real."
- "Science cannot explain consciousness, therefore it must be supernatural."
**Why it is wrong:** Absence of evidence is not, in general, evidence of absence. There are many things that have not yet been investigated. The appropriate response to insufficient evidence is to remain agnostic — to say "we do not yet know" — not to fill the gap with a preferred explanation.
**Correct reasoning:** Maintain that the burden of proof lies with the person making the positive claim. Absence of disproof does not confirm the claim; it merely leaves it untested.
---
### 6. False dichotomy {-}
**What it is:** Presenting a situation as though only two options exist, when in fact more are available — typically by framing the two extreme positions as the only possibilities.
**Examples:**
- "America: love it or leave it."
- "You are either with us or against us."
- "Either we cut all social programmes or the economy collapses."
**Why it is manipulative:** It forces a choice between extremes, eliminates middle ground and compromise, and polarises discussion by making nuanced positions invisible.
---
### 7. Slippery slope {-}
**What it is:** Claiming that one action will inevitably lead, through a chain of steps, to an extreme and undesirable outcome — without providing evidence that the causal chain would actually operate.
**Examples:**
- "If we allow same-sex marriage, next people will marry animals."
- "If we ban one type of gun, soon they will ban all guns."
**When it is legitimate:** When there is actual evidence that each step in the chain follows predictably from the previous one, a slope argument may be valid. The fallacy lies in asserting the chain without that evidence.
**When it is a fallacy:** When the argument relies on fear of an extreme outcome rather than on evidence that the intermediate steps are likely.
---
### 8. Circular argument (begging the question) {-}
**What it is:** An argument in which the conclusion is already contained in, or assumed by, one of the premises. The argument appears to provide evidence for its conclusion but actually just restates the same claim in different words.
**Examples:**
- "The Bible is true because it says so in the Bible."
- "I am trustworthy because I say I am trustworthy."
**Why it fails:** No new information is added. If you accept the premise, you have already accepted the conclusion. The argument provides no independent reason to believe the conclusion is true.
**Valid structure:** Independent premises lead through explicit reasoning to a conclusion that was not already assumed in the starting point.
---
### 9. Red herring {-}
**What it is:** Introducing irrelevant information to distract from the actual question or issue under discussion.
**Example:**
```
Journalist: "Why did the government waste millions on this failed project?"
Politician: "Let me tell you about all the great schools we have built.
Education is so important, do you not agree?"
```
The politician has not addressed the question of the waste. Instead, they have introduced a different — and more politically comfortable — topic.
**Why it works:** People naturally follow new conversational directions, and the original question is easy to lose track of, especially in spoken discourse.
---
### 10. Sunk cost fallacy {-}
**What it is:** Continuing to invest resources (time, money, effort) in something because of what has already been invested, even when the future expected costs outweigh the future expected benefits.
**Examples:**
- Watching a film to the end even though you stopped enjoying it hours ago, because you have already invested two hours.
- Continuing to fund a research project that has clearly failed because substantial resources have already been committed.
**Why it is irrational:** Past costs are irretrievable. They cannot be recovered and are therefore irrelevant to the decision about what to do next. The only rational question is: given the current situation, do the expected future benefits outweigh the expected future costs?
**Rational approach:** Evaluate each decision forward-looking only. Ask: if I were starting from scratch with no prior investment, would I begin this? If no, the sunk cost fallacy may be operating.
---
## Why fallacies matter for science {-}
::: {.callout-important}
## Fallacies undermine knowledge
Without awareness of logical fallacies, researchers and readers reach wrong conclusions, waste resources, defend indefensible positions, and spread misinformation — even in good faith.
Science provides the institutional antidote: peer review catches ad hominem and cherry-picking; pre-registration counters confirmation bias; the requirement to engage with the strongest version of opposing theories counters straw man arguments; and the norm of reporting negative results counters selective reporting.
Recognising fallacies in one's own thinking is harder than recognising them in others' — but it is the more important skill.
:::
---
## Testing your understanding: the Wason Selection Task {-}
You see four cards. Each card has a letter on one side and a number on the other side. The visible faces are:
```
Card 1: A Card 2: K Card 3: 2 Card 4: 7
```
**The rule:** "If there is a vowel on one side of a card, then there is an even number on the other side."
::: {.callout-tip}
## Which cards must you turn over to test whether the rule is true or false?
Think carefully. You need to choose the minimum set of cards that could definitively falsify the rule.
:::
The most common answer is Cards 1 and 3 (A and 2). **This is incorrect.**
**The correct answer is Cards 1 and 4 (A and 7).**
- **Card 1 (A):** Must be turned over. It is a vowel, so the rule requires an even number on the reverse. If there is an odd number, the rule is false.
- **Card 2 (K):** Does not need to be turned over. The rule says nothing about what must be on the reverse of consonants.
- **Card 3 (2):** Does not need to be turned over. The rule says nothing about what must be on the reverse of even numbers — a vowel or a consonant on the reverse would both be compatible with the rule.
- **Card 4 (7):** Must be turned over. If there is a vowel on the reverse, the rule is violated (the vowel would require an even number on its reverse, but we have an odd number).
**What this demonstrates:** Most people turn over cards that *confirm* the rule (vowel, even number) rather than cards that could *falsify* it (vowel?, odd number). This is confirmation bias operating in a purely logical context. Scientific thinking requires actively seeking evidence that could prove you wrong, not just evidence consistent with your hypothesis.
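The logic of the task can be made explicit in base R. This is a sketch: a card can falsify the rule only if it pairs a vowel with an odd number, so a card must be turned over exactly when some hidden face could complete such a pair.

```r
# Which cards can falsify "if vowel on one side, then even number on the other"?
is_vowel <- function(x) x %in% c("A", "E", "I", "O", "U")
is_odd   <- function(x) {
  n <- suppressWarnings(as.numeric(x))
  !is.na(n) && n %% 2 == 1
}
# Turn a card over only if a hidden face could complete a (vowel, odd) pair:
must_turn <- function(visible) is_vowel(visible) || is_odd(visible)
sapply(c("A", "K", "2", "7"), must_turn)
#    A     K     2     7
# TRUE FALSE FALSE  TRUE
```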
---
## Testing your understanding: the number sequence puzzle {-}
Here are three numbers that follow a rule I have in mind:
```
1 2 4
```
You may propose one additional number, and I will tell you whether it follows my rule. What number would you choose, and what rule do you hypothesise?
::: {.callout-tip}
## Think before reading on
What is your hypothesis about the rule? What number would best test it?
:::
**Typical responses:** Most people guess 8 (following the "doubling" hypothesis) or 16 (following the "squaring" hypothesis). Both follow the rule — but neither tests whether the hypothesis is correct.
**A better strategy:** Propose a number that your hypothesis predicts would *not* follow the rule — say, 5 or 7 or 10. If the rule is "each number is larger than the previous one" (which it is), then 5 would follow the rule, falsifying the doubling hypothesis.
**The actual rule:** "Each number must be larger than the previous number."
**What this demonstrates:** Confirmation bias again. When people think their hypothesis is "doubling," they propose numbers that would confirm it rather than numbers that would challenge it. But only by testing the boundaries of your hypothesis — by attempting to falsify it — can you distinguish your hypothesis from the many other hypotheses compatible with your initial evidence.
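The logic of the puzzle can be stated in one line of R, using the rule revealed above:

```r
# The actual rule: each number must be larger than the previous one
rule <- function(x) all(diff(x) > 0)

rule(c(1, 2, 4, 8))  # TRUE  — consistent with doubling, so uninformative
rule(c(1, 2, 4, 7))  # TRUE  — doubling predicted failure: doubling is falsified
rule(c(1, 2, 4, 3))  # FALSE — 3 is not larger than 4
```

Only the second test discriminates between hypotheses: a number that the doubling hypothesis predicts will fail, but that nevertheless follows the rule, rules doubling out.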
---
::: {.callout-tip}
## Exercises: Logical Fallacies
:::
**Q3. A politician responds to a question about rising crime rates by saying: "I am proud of the new schools this government has built, and education is the foundation of a safe society." Which fallacy does this illustrate?**
```{r}
#| echo: false
#| label: "FALL_Q1"
check_question(
"Red herring — the politician introduces a different topic (schools, education) to avoid addressing the original question about crime rates.",
options = c(
"Red herring — the politician introduces a different topic (schools, education) to avoid addressing the original question about crime rates.",
"Straw man — the politician has misrepresented the question about crime.",
"Appeal to authority — the politician invokes their own authority to deflect the question.",
"False dichotomy — the politician implies that education is the only solution to crime."
),
type = "radio",
q_id = "FALL_Q1",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! A red herring introduces irrelevant information — here, a new and more comfortable topic — to distract from the actual question. The politician has not addressed crime rates at all; they have pivoted to schools. The response is not a misrepresentation of the question (straw man), nor does it cite authority, nor does it present a false either/or choice.",
wrong = "Look at what the politician actually does: they do not address the question about crime rates at all. Instead, they shift the conversation to a different topic. Which fallacy involves introducing an irrelevant topic to avoid engaging with the actual issue?"
)
```
**Q4. A friend argues: "I have watched five seasons of this series and it has been mediocre throughout. I might as well finish it — I have already invested 40 hours." What is the flaw in this reasoning?**
```{r}
#| echo: false
#| label: "FALL_Q2"
check_question(
"Sunk cost fallacy — the 40 hours already spent are irretrievable and should not affect the decision about whether to continue. The only rational question is whether the remaining episodes are worth the time they will take.",
options = c(
"Sunk cost fallacy — the 40 hours already spent are irretrievable and should not affect the decision about whether to continue. The only rational question is whether the remaining episodes are worth the time they will take.",
"False dichotomy — the friend assumes the only options are finishing or abandoning the series.",
"Slippery slope — the friend assumes that stopping here will lead to never finishing any series.",
"Circular argument — the friend's premise already assumes the conclusion that finishing is worthwhile."
),
type = "radio",
q_id = "FALL_Q2",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! The 40 hours are a sunk cost — they are already spent and cannot be recovered. The correct question is forward-looking: are the remaining episodes worth watching given what they will cost in time and what enjoyment (or lack thereof) they are likely to provide? Past investment is rationally irrelevant to future decisions. The same fallacy operates in research when we continue a failing project because of the resources already committed rather than because future prospects justify continuation.",
wrong = "Focus on the phrase 'I have already invested 40 hours.' What type of cost is a cost that has already been incurred and cannot be recovered? And is that cost logically relevant to the decision about what to do next?"
)
```
---
# Part 3: What Is Science? {#part3}
::: {.callout-note}
## Section Overview
**What you will learn:** A comprehensive definition of science; the distinction between empirical and formal sciences; the scientific method as a cycle of hypothesis-testing; the Clever Hans case study as a demonstration of why methodology matters; and Popper's principle of falsification as the criterion that separates scientific from non-scientific claims.
:::
## Defining science {-}
The working definition given in Part 1 can now be made more precise:
::: {.callout-note}
## Science: comprehensive definition
**Science** is an unbiased, fundamentally methodological enterprise that aims at building and organising knowledge about the empirical world in the form of falsifiable explanations and predictions, by means of systematic observation and experimentation.
The key components are:
1. **Unbiased**: Systematic checks against the cognitive biases documented in Parts 1 and 2
2. **Methodological**: Follows principled, replicable procedures
3. **Empirical**: Based on observation of reality, not pure reasoning
4. **Falsifiable**: Claims can in principle be proven wrong by evidence
5. **Explanatory**: Accounts for why patterns occur, not just that they occur
6. **Predictive**: Generates testable predictions about what will be observed
7. **Observational**: Depends on careful, instrument-mediated measurement
8. **Experimental**: Tests hypotheses through controlled manipulation
:::
## Types of science {-}
**Empirical sciences** examine phenomena of reality through the scientific method. Their goal is to explain and predict what actually exists and occurs. Examples include biology, physics, chemistry, psychology, sociology, and linguistics. Their method involves observing reality, forming and testing hypotheses, and refining theories in response to evidence.
**Formal sciences** examine abstract systems through axiomatic reasoning. Their goal is logical coherence and internal consistency. Examples include mathematics, formal logic, theoretical computer science, and formal linguistics. Their method involves starting from axioms, applying logical operations, and deriving theorems. Crucially, formal sciences can *prove* their results — because their claims concern abstract objects defined by their own axioms.
The key difference is epistemological: formal sciences can establish truths by logical proof; empirical sciences cannot prove — they can only test and potentially falsify. As Popper showed, this asymmetry between proof and falsification is fundamental to understanding how science works.
---
## The scientific method {-}
{ width=55% style="float:right; padding:10px" }
Science does not proceed in a straight line from observation to truth. It is a **cycle** of hypothesis formation, testing, revision, and renewed testing — a continuous self-correcting process.
The basic steps are:
1. **Observe a phenomenon** and notice something requiring explanation
2. **Formulate a research question** — make it specific and tractable
3. **Review existing literature** — what is already known?
4. **Form a hypothesis (H₁)** — a testable prediction based on prior observation and theory
5. **Form a null hypothesis (H₀)** — the position that there is no effect or difference; what we try to disprove
6. **Determine significance level** — how certain must we be to reject H₀? (Typically α = .05)
7. **Design the study** — how will you collect data to test the hypothesis?
8. **Collect data** — execute the design
9. **Analyse the data** — apply statistical tests; calculate effect sizes
10. **Draw conclusions** — can you reject H₀? What does this imply for H₁?
11. **If H₀ cannot be rejected** — form a new hypothesis and repeat
### Example: finding lost keys {-}
The abstract steps become concrete with a trivial everyday example:
```
Observation: My keys are missing.
Question: Where are my keys?
Literature: I have left them on the TV table before.
H₁: My keys are on the TV table.
H₀: My keys are NOT on the TV table.
Design: I will check the TV table.
Data: I checked — no keys there.
Analysis: H₀ cannot be rejected.
Conclusion: My keys must be elsewhere.
New H₁: My keys are in my coat pocket.
[Repeat]
```
This trivial example captures the logic that applies to the most sophisticated experiments.
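The statistical steps (4–10) can be sketched with a toy example in R — a hypothetical coin-flip experiment, not data from any actual study:

```r
# H1: the coin is biased; H0: the coin is fair (p = 0.5); alpha = .05
heads <- 62                            # hypothetical data: 62 heads in 100 flips
test  <- binom.test(heads, 100, p = 0.5)
test$p.value          # probability of data at least this extreme if H0 is true
test$p.value < 0.05   # TRUE: reject H0 at the .05 level
```

Here the p-value (about .02) falls below α = .05, so H₀ is rejected; with, say, 55 heads it would not be, and step 11 — form a new hypothesis and repeat — would apply.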
---
## Clever Hans: a case study in methodology {-}
{ width=45% style="float:right; padding:10px" }
### The phenomenon {-}
Between 1891 and 1904, a horse named Clever Hans became famous across Europe for apparently being able to perform arithmetic, answer questions in German, spell words, and tell the time. His owner, Wilhelm von Osten, would ask questions and Hans would tap his hoof the correct number of times. Multiple scientific commissions investigated and found no evidence of fraud. Von Osten appeared to genuinely believe in his horse's abilities.
### The investigation {-}
The psychologist Oskar Pfungst (1907) took a more systematic approach. He designed controlled experiments varying two factors: whether the questioner knew the correct answer, and whether Hans could see the questioner.
| Condition | Result |
|---|---|
| Questioner knows the answer and Hans can see the questioner | Hans answers correctly |
| Questioner does not know the answer | Hans cannot answer |
| Questioner knows the answer but Hans cannot see the questioner (blinders) | Hans cannot answer |
The pattern was unambiguous: Hans's performance depended entirely on whether he could see someone who knew the answer.
### The discovery {-}
Pfungst found that questioners unconsciously provided micro-cues that Hans had learned to read. When asking a question requiring a numerical tap count, the questioner would unconsciously tense up; as Hans approached the correct number, the questioner would relax slightly. Hans had learned to start tapping at the tensing cue and stop at the relaxing cue — appearing to know the answer when he was actually reading involuntary muscle movements.
::: {.callout-important}
## Lessons from Clever Hans
1. **Appearances deceive:** Even trained scientists were fooled by systematic observation without proper controls.
2. **Belief bias:** Questioners who believed in Hans tended to confirm their belief through uncritical observation.
3. **Unintentional cuing:** Von Osten was not deceiving anyone — he gave the cues entirely without awareness.
4. **The need for controls:** Only a systematic design that manipulated questioner knowledge and visibility could reveal the truth.
5. **Experimenter effects:** The observer's expectations can influence the outcome of an observation or experiment — a finding that motivates double-blind experimental designs.
The term **Clever Hans effect** now refers to any situation in which an experimenter's unconscious behaviour influences a subject's responses, and serves as a reminder of why blinding and systematic methodology are not merely bureaucratic requirements but essential safeguards against self-deception.
:::
---
## Popper and falsification {-}
{ width=25% style="float:right; padding:10px" }
### The problem of induction {-}
The Austrian-British philosopher Karl Popper (1902–1994) identified a fundamental problem with the traditional view of science as proceeding from many observations to general laws:
```
Traditional view:
Observation 1: Swan 1 is white
Observation 2: Swan 2 is white
...
Observation 10,000: Swan 10,000 is white
↓
Law: All swans are white
```
**Popper's insight:** No number of confirming observations can *prove* a universal generalisation true. No matter how many white swans you observe, the 10,001st swan might be black. And indeed, when Europeans arrived in Australia they encountered black swans — observations that immediately falsified the "all swans are white" generalisation that had seemed secure for centuries.
But notice the asymmetry: **a single black swan is sufficient to refute the universal claim.** While we cannot verify by accumulating positive evidence, we can — and must — test by seeking negative evidence.
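The asymmetry can be expressed in a few lines of R (hypothetical data, of course):

```r
swans <- rep("white", 10000)    # 10,000 confirming observations
all(swans == "white")           # TRUE — yet the 10,001st swan remains untested
swans <- c(swans, "black")      # one black swan
all(swans == "white")           # FALSE — the universal claim is falsified
```

No value of the first call, however many white swans feed into it, proves the generalisation; a single counterexample settles the matter in the other direction.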
### Falsification as the criterion of science {-}
::: {.callout-important}
## Popper's criterion
**A theory is scientific if and only if it is falsifiable.**
A theory is falsifiable when it is possible to describe, in advance, what kind of observation would prove it wrong. Falsifiable theories take an empirical risk: they stake out a position that could be contradicted by evidence.
A theory that is compatible with every possible observation is not scientific — not because it is necessarily false, but because it cannot be tested and therefore cannot be part of the self-correcting process that constitutes science.
:::
**Falsifiable (scientific) examples:**
- "All swans are white" — falsified by a single non-white swan
- "The Earth orbits the Sun" — falsifiable by stellar parallax measurements
- "Smoking causes cancer" — falsifiable by epidemiological studies
**Not falsifiable (not scientific) examples:**
- "God exists" — no observation could definitively disprove this
- "Everything happens for a reason" — compatible with any possible outcome
- "This patient's symptoms are caused by repressed childhood memories" — can be interpreted to confirm the theory regardless of the patient's response
The last example points to Popper's famous critique of psychoanalysis: Freudian theory, he argued, is structured so that any conceivable behaviour can be interpreted as confirming it. A patient who is close to their mother confirms the Oedipal hypothesis; a patient who is distant from their mother confirms it too (they are "repressing" their feelings). A theory that cannot be falsified by any evidence is not a scientific theory — even if it happens to be true.
### Science as evolutionary progress {-}
Popper drew an analogy between science and biological evolution. In evolution, genetic variation is subjected to natural selection — variants that fit their environment survive; those that do not are eliminated. In science, theoretical variation (new hypotheses and conjectures) is subjected to empirical testing — theories that withstand attempts at falsification survive; those that do not are rejected. Both processes are progressive but not teleological: they eliminate what does not work without guaranteeing that what remains is final truth.
**Implications for research practice:**
- Ask of every hypothesis: "What would falsify this?"
- Design studies with the explicit goal of testing, not just confirming
- Treat a hypothesis that survives many serious attempts at falsification as well-corroborated, not as proven
- Treat a failed attempt to confirm as informative — a narrowing of the space of possibilities
---
## What is linguistics? {-}
::: {.callout-note}
## Linguistics
**Linguistics** is the scientific study of language and individual languages. Linguists aim to uncover, describe, explain, and model the systems that underlie human language use.
As an **empirical science**, linguistics studies language through systematic observation of real language use, tests hypotheses about linguistic structure and function, and produces falsifiable claims about how language works.
:::
**Descriptive versus prescriptive linguistics** illustrates the scientific/non-scientific distinction:
| Approach | Character | Example |
|---|---|---|
| **Descriptive (scientific)** | Describes what speakers actually do | "English speakers frequently use *ain't* in casual conversation" |
| **Prescriptive (non-scientific)** | Prescribes what speakers should do | "You should not say *ain't*" |
Prescriptive claims are not falsifiable in Popper's sense — they are normative, not empirical. Descriptive claims can be tested against corpus data and thus belong to the domain of science.
**Example: the scientific circle in linguistics**
```
Observation: Children appear to learn grammar without explicit instruction.
Question: How do children acquire language?
Literature: Chomsky's Universal Grammar hypothesis;
Tomasello's usage-based approach.
H₁: Children extract grammatical patterns through frequency tracking.
Design: Expose children to artificial language with manipulated
input frequencies; record which patterns they learn.
Data: Children's productions; error patterns; learning rates.
Analysis: Compare learning rates for high- versus low-frequency patterns.
Conclusion: Higher frequency predicts faster acquisition —
supports usage-based hypothesis.
Refinement: Test with different age groups, complexity levels.
[Repeat]
```
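The Analysis step in this cycle can be sketched in R. The numbers below are simulated purely for illustration (the group size, means, and standard deviation are assumptions, not real acquisition data):

```r
# Hypothetical data: proportion of correct productions per child,
# for patterns that were high- vs. low-frequency in the input
set.seed(42)
n <- 30                                        # children per condition (assumed)
high_freq <- rnorm(n, mean = 0.75, sd = 0.10)  # assumed group means
low_freq  <- rnorm(n, mean = 0.60, sd = 0.10)

# Test the usage-based prediction: high-frequency patterns learned better
t.test(high_freq, low_freq, alternative = "greater")

# Report an effect size alongside the test (Cohen's d)
pooled_sd <- sqrt((var(high_freq) + var(low_freq)) / 2)
cohens_d  <- (mean(high_freq) - mean(low_freq)) / pooled_sd
cohens_d
```

A significant difference in the predicted direction would support H₁; no difference, or an advantage for low-frequency patterns, would count against it. That vulnerability to disconfirming data is precisely what makes the hypothesis falsifiable.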
---
::: {.callout-tip}
## Exercises: The Scientific Method and Falsification
:::
**Q5. A researcher proposes the theory: "Students who feel positively about their lecturer will perform better on written assessments." Is this theory scientific in Popper's sense?**
```{r}
#| echo: false
#| label: "SCI_Q1"
check_question(
"Yes — the theory is falsifiable. One could measure student affect toward the lecturer and performance on assessments, and test whether the predicted positive association obtains. A study finding no significant association, or a negative association, would constitute evidence against the theory.",
options = c(
"Yes — the theory is falsifiable. One could measure student affect toward the lecturer and performance on assessments, and test whether the predicted positive association obtains. A study finding no significant association, or a negative association, would constitute evidence against the theory.",
"No — the theory is not falsifiable because student affect is a subjective variable that cannot be measured reliably.",
"No — the theory is not falsifiable because it is stated as a general tendency rather than a universal law.",
"Yes, but only if the researcher specifies in advance what effect size would count as a confirmation."
),
type = "radio",
q_id = "SCI_Q1",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! The theory is falsifiable because it makes a specific directional prediction (positive affect → better performance) that could be contradicted by data. The fact that affect is a psychological variable and therefore requires operationalisation does not make the theory unfalsifiable — many scientific variables require operationalisation. The theory would be falsified by a well-designed study finding no association or a negative one. Popper's criterion asks whether a theory *could in principle* be shown to be wrong; this one clearly could.",
wrong = "Popper's falsifiability criterion asks: can we describe an observation that would prove the theory wrong? For this theory, what would count as disconfirming evidence? If we can describe such evidence, the theory is falsifiable."
)
```
**Q6. A therapist argues: "If a patient denies having repressed childhood trauma, that itself shows how deeply it is repressed. If a patient acknowledges having difficult memories, that confirms the trauma theory." What is the scientific problem with this argument?**
```{r}
#| echo: false
#| label: "SCI_Q2"
check_question(
"The theory is unfalsifiable — both possible outcomes (denial and acknowledgement) are interpreted as confirming evidence. A theory that is consistent with every possible observation cannot be tested and therefore cannot be scientific in Popper's sense.",
options = c(
"The theory is unfalsifiable — both possible outcomes (denial and acknowledgement) are interpreted as confirming evidence. A theory that is consistent with every possible observation cannot be tested and therefore cannot be scientific in Popper's sense.",
"The theory commits a circular argument — the conclusion that trauma exists is assumed in the premise.",
"The theory commits a false dichotomy — it assumes patients either deny or acknowledge trauma, with no other possibilities.",
"The theory relies on an appeal to authority — it depends on the therapist's expert judgment rather than objective evidence."
),
type = "radio",
q_id = "SCI_Q2",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! This is precisely the structure Popper used to critique psychoanalysis. When both denial and acknowledgement are taken as confirmation of the same theory, there is no possible observation that could falsify it. The theory is therefore not scientific in Popper's sense — not because it is necessarily wrong, but because it places itself outside the reach of empirical testing. Compare this with a falsifiable alternative: 'Patients with repressed trauma will show higher physiological stress responses to trauma-related cues than matched controls' — this makes a specific, testable, falsifiable prediction.",
wrong = "The problem is not primarily about circular reasoning or false dichotomy, though those may also be present. Focus on what the structure of the argument implies for testability. If both possible outcomes confirm the theory, what kind of evidence could falsify it?"
)
```
---
# Part 4: Applying Scientific Thinking {#part4}
::: {.callout-note}
## Section Overview
**What you will learn:** How to apply the scientific method to real-world claims — including health claims, news reports, and unusual beliefs — and how to design a linguistics study from the ground up.
:::
## Applying the scientific circle to real claims {-}
### Ghost belief {-}
Given what we have covered, we can offer a scientific analysis of why people believe in ghosts — not a dismissal of those beliefs, but an explanation of the cognitive and perceptual mechanisms that generate such experiences in the absence of actual ghosts.
Several factors operate together:
**Pareidolia and agency detection** — the brain is primed to detect faces and intentional agents. In low light, in unfamiliar environments, or when anxious, ambiguous stimuli are more likely to be interpreted as presences.
**Confirmation bias** — people who believe in ghosts attend to and remember experiences that are consistent with that belief (unexplained sounds, feelings of being watched) and discount or forget the vast majority of experiences that have mundane explanations.
**Sleep paralysis** — during transitions in and out of REM sleep, it is possible to experience vivid hallucinations combined with an inability to move. This experience — including the sensation of a threatening presence in the room — is well-documented neurologically and has likely generated ghost and demon narratives across cultures.
**Infrasound** — sounds below the threshold of human hearing (below roughly 20 Hz) can produce feelings of unease, anxiety, and the sensation of an unseen presence. Old buildings with large resonant chambers sometimes produce infrasound.
**Emotional factors** — grief, sleep deprivation, and fear heighten the tendency to perceive meaningful patterns in ambiguous stimuli.
None of these explanations requires ghosts to exist. Together, they can account for the full range of reported ghost experiences using well-understood mechanisms.
### Evaluating health claims {-}
**Claim: "Vitamin X cures cancer!"**
Applying scientific criteria:
1. **Is it falsifiable?** Yes — the claim could be tested on cancer patients with proper controls.
2. **What is the evidence?** Anecdotes? Observational studies? Randomised controlled trials?
3. **Is the sample adequate?** How many participants? What was the control condition?
4. **Were confounds controlled?** Diet, other treatments, disease severity?
5. **What is the effect size?** Statistically significant but practically trivial?
6. **Has it been replicated?** By independent groups without financial stake?
7. **Was it peer-reviewed?** Does it appear in a respected journal or in promotional material?
8. **Are there conflicts of interest?** Who funded the study?
An anecdote about one person who took the vitamin and recovered is not evidence in the relevant sense — because people recover from cancer without the vitamin, and we have no way of knowing what would have happened without it.
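The problem with recovery anecdotes can be made concrete with a short simulation. The 5% spontaneous remission rate below is an illustrative assumption, not a real clinical figure:

```r
# Assume the vitamin does NOTHING, and that some patients
# recover spontaneously (5% is an illustrative figure only)
set.seed(1)
n_takers <- 10000                  # people who try the vitamin
spontaneous_rate <- 0.05
recoveries <- rbinom(1, size = n_takers, prob = spontaneous_rate)
recoveries                         # roughly 500 "success stories"
```

Every one of those recoveries can become a compelling testimonial, even though the treatment had no effect at all. Only a controlled comparison can separate the vitamin from the base rate.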
### Evaluating news claims {-}
**Headline: "Study shows chocolate improves memory!"**
Critical questions for any such claim:
1. Is the claimed relationship **causal** or merely **correlational**? (Studies showing that chocolate eaters have better memory may simply reflect that wealthier people eat more good-quality chocolate and also have better access to education and healthcare.)
2. Was there a **control group**?
3. Was the **sample size** adequate?
4. What was the **effect size** — is the improvement practically meaningful?
5. Who **funded** the study? (A study funded by a chocolate manufacturer requires particular scrutiny.)
6. Has it been **replicated** independently?
7. Is the result **consistent** with the broader body of research?
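Question 1 (causal versus correlational) can be illustrated with simulated data in which wealth drives both chocolate consumption and memory scores, while chocolate has no effect on memory at all. The coefficients are arbitrary values chosen for illustration:

```r
# Confounding demo: wealth causes both variables; chocolate
# has NO direct effect on memory
set.seed(123)
n <- 500
wealth    <- rnorm(n)                    # unobserved confound
chocolate <- 0.6 * wealth + rnorm(n)     # wealth -> chocolate consumption
memory    <- 0.6 * wealth + rnorm(n)     # wealth -> memory (education, healthcare)

cor(chocolate, memory)                   # clearly positive correlation

# Adjusting for the confound removes the spurious association
summary(lm(memory ~ chocolate + wealth))$coefficients["chocolate", "Estimate"]
```

The raw correlation is positive even though the true chocolate effect is zero; once wealth is held constant, the estimated effect shrinks toward zero. In real studies the confound is rarely this easy to measure, which is why randomised designs matter.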
### Making personal decisions {-}
**Claim: "This quantum healing bracelet balances your body's energy."**
Applying scientific analysis:
1. **Falsifiability:** The claim is vague. "Balancing quantum energy" does not specify what would constitute evidence of success or failure.
2. **Mechanism:** There is no known biological mechanism by which wearing a bracelet could affect health through quantum effects. Quantum phenomena operate at sub-atomic scales, not at the scale of biological systems.
3. **Evidence:** Only testimonials — anecdotes subject to all the biases documented in this tutorial (confirmation bias, placebo effect, regression to the mean).
4. **Red flags:** Pseudoscientific vocabulary ("quantum," "energy balance") misappropriated from physics; claims only found in alternative-medicine contexts; no peer-reviewed trials.
5. **Conclusion:** The claim is not falsifiable as stated and lacks any plausible mechanistic basis. The bracelet is extremely unlikely to produce the claimed effects.
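Regression to the mean (point 3) is worth a quick simulation, because it explains why the bracelet will nonetheless seem to work. Suppose symptom severity simply fluctuates around a stable baseline, and people try a remedy on their worst days (all numbers here are arbitrary):

```r
# Regression to the mean: symptoms fluctuate around a baseline of 50
set.seed(7)
n_days   <- 2000
symptoms <- 50 + rnorm(n_days, sd = 10)    # daily severity, arbitrary units

bad_days <- which(symptoms > 70)           # days bad enough to buy the bracelet
bad_days <- bad_days[bad_days < n_days]    # drop the last day (no "next day")

mean(symptoms[bad_days])                   # well above baseline, by construction
mean(symptoms[bad_days + 1])               # back near 50: "improvement", no treatment
```

Anyone who puts the bracelet on during a flare-up will, on average, feel better the next day, whether or not the bracelet does anything.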
---
## Designing a linguistics study {-}
::: {.callout-tip}
## Exercise: designing a study on spoken fluency
You want to investigate whether younger or older speakers of English differ in spoken fluency. How would you design this study scientifically?
Think through the following before reading the answer below.
:::
```{r}
#| echo: false
#| label: "APP_Q1"
check_question(
"Operationalise 'fluency' as a measurable variable (e.g. words per minute, rate of filled pauses, rate of self-corrections); define 'young' and 'old' with specific age ranges; recruit matched groups (controlling for education, health, language background); use a standardised elicitation task; record, transcribe, and code blind (coder does not know participant age); apply appropriate statistical tests; report effect sizes alongside significance tests.",
options = c(
"Operationalise 'fluency' as a measurable variable (e.g. words per minute, rate of filled pauses, rate of self-corrections); define 'young' and 'old' with specific age ranges; recruit matched groups (controlling for education, health, language background); use a standardised elicitation task; record, transcribe, and code blind (coder does not know participant age); apply appropriate statistical tests; report effect sizes alongside significance tests.",
"Ask a group of people informally whether they think older or younger speakers are more fluent and report the modal answer.",
"Record a few conversations with young and old speakers and note your impressions of who seemed more fluent.",
"Search for anecdotal reports online about young versus old speakers and compile a list of observations."
),
type = "radio",
q_id = "APP_Q1",
random_answer_order = TRUE,
button_label = "Check answer",
right = "Correct! A scientific study requires careful operationalisation of every key concept, controlled sampling to ensure the groups differ only on the variable of interest, a standardised procedure to ensure all participants are treated comparably, blind coding to prevent observer expectations from influencing judgements, and appropriate statistical analysis. Each of the alternative approaches is vulnerable to one or more of the cognitive biases and methodological problems documented in this tutorial: impressionistic assessment is subject to confirmation bias; anecdotal evidence is subject to selection bias and lack of controls.",
wrong = "Think about what a scientific approach requires at each stage: (1) precise definition of all key terms, (2) controlled sampling, (3) standardised procedure, (4) blind measurement, (5) statistical analysis. Which answer describes an approach that satisfies all of these requirements?"
)
```
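One further design decision the model answer implies is sample size. A quick power analysis shows how many speakers each age group would need; the assumed effect size (Cohen's d = 0.5, a conventional "medium" effect) is an illustrative choice, not an established value for age-related fluency differences:

```r
# Speakers needed per age group to detect a standardised difference
# of d = 0.5 in fluency, with 80% power at alpha = .05
res <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
res
# With sd = 1, delta = 0.5 corresponds to Cohen's d = 0.5;
# res$n gives the required group size (roughly 64 per group).
```

Running the study with far fewer speakers than this would risk a non-significant result even if a real difference exists, which is itself a form of uninformative design.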
---
# Quick Reference {.unnumbered}
## Cognitive biases checklist {-}
When evaluating evidence or making decisions, watch for:
- **Emotional reasoning** — feeling does not constitute fact
- **Confirmation bias** — the tendency to seek only supporting evidence
- **Poor probabilistic intuition** — systematic underestimation of compound and combinatorial probabilities
- **Pattern-seeking** — perceiving agency or causal structure in random co-occurrences
- **Pareidolia** — seeing faces and meaningful forms in ambiguous stimuli
- **Context effects** — perceptions are shaped by prior expectations and surrounding context
- **Anthropocentric bias** — assuming our perceptual experience constitutes rather than filters reality
## Logical fallacies to avoid {-}
- **Ad hominem** — attacking the person rather than the argument
- **Appeal to authority** — citing a person's opinion rather than their evidence
- **Straw man** — misrepresenting an opponent's position
- **False dichotomy** — presenting only two options when more exist
- **Slippery slope** — asserting an inevitable chain of consequences without evidence
- **Circular argument** — assuming what you are trying to prove
- **Red herring** — introducing an irrelevant topic to avoid the real issue
- **Sunk cost fallacy** — continuing due to prior investment rather than future prospects
- **Argument from ignorance** — treating absence of disproof as proof
- **Confirmation bias / cherry-picking** — selectively reporting supportive evidence
## Scientific method summary {-}
```
1. Observe → 2. Question → 3. Review literature →
4. Hypothesise (H₁ and H₀) → 5. Design → 6. Collect data →
7. Analyse → 8. Conclude → 9. Refine → [Repeat]
```
Key principles: falsifiable hypotheses; controlled observation; statistical analysis; peer review; replication.
## Evaluating claims {-}
Questions to ask of any empirical claim:
1. Is it falsifiable?
2. What is the evidence? (Anecdote? Observational study? Randomised trial?)
3. Was the sample size adequate?
4. Were proper controls included?
5. Were confounds addressed?
6. Is the effect size meaningful?
7. Has it been independently replicated?
8. Was it peer-reviewed?
9. Are there conflicts of interest?
10. Is it consistent with the broader body of evidence?
---
# Citation & Session Info {.unnumbered}
::: {.callout-note}
## Citation
```{r citation-callout, echo=FALSE, results='asis'}
cat(
params$author, ". ",
params$year, ". *",
params$title, "*. ",
params$institution, ". ",
"url: ", params$url, " ",
"(Version ", params$version, "), ",
"doi: ", params$doi, ".",
sep = ""
)
```
```{r citation-bibtex, echo=FALSE, results='asis'}
key <- paste0(
tolower(gsub(" ", "", gsub(",.*", "", params$author))),
params$year,
tolower(gsub("[^a-zA-Z]", "", strsplit(params$title, " ")[[1]][1]))
)
cat("```\n")
cat("@manual{", key, ",\n", sep = "")
cat(" author = {", params$author, "},\n", sep = "")
cat(" title = {", params$title, "},\n", sep = "")
cat(" year = {", params$year, "},\n", sep = "")
cat(" note = {", params$url, "},\n", sep = "")
cat(" organization = {", params$institution, "},\n", sep = "")
cat(" edition = {", params$version, "},\n", sep = "")
cat(" doi = {", params$doi, "}\n", sep = "")
cat("}\n```\n")
```
:::
```{r fin}
sessionInfo()
```
::: {.callout-note}
## AI Transparency Statement
This tutorial was revised and restyled with the assistance of **Claude** (claude.ai), a large language model created by Anthropic. All substantive content — examples, explanations, case studies, and reasoning — was retained from the original and reviewed and approved by Martin Schweinberger, who takes full responsibility for the tutorial's accuracy.
:::
[Back to top](#intro)
[Back to HOME](/index.html)
# References {.unnumbered}